1 Introduction

Systems as relational structures. Complex systems 
arising in many areas of Computer Science can be naturally 
represented as relational structures. The state of an im- 
perative program can be specified using sets and relations 
denoted by unary and binary predicates [24, 32, 66, 8], es- 
pecially for object-oriented programs [36, 63]; a relational 
database is a finite relational structure [18, 16]; knowledge 
bases and deductive databases can also be based on predi- 
cate logic [1, 41, 53]. 

Shape analysis. Shape analysis techniques [65, 29, 33, 26, 
27, 25, 17, 40, 39, 43, 37, 55] can verify and derive precise 
properties of objects in the heap. Shape analysis is therefore 
important for reasoning about programs written in modern 
imperative programming languages. Shape analysis is also 
promising as a general-purpose verification technique, be- 
cause of its ability to reason about graphs as general struc- 
tures, and the ability to summarize properties of unbounded 
sets of objects. 

Many of the shape analysis techniques have a logical 
foundation: [65] is based on (two-valued and three-valued) 
first-order logic with transitive closure, [39, 40, 37, 55] 
is based on monadic second-order logic of trees, [26, 27] 
is based on graph grammars which are closely related to 
monadic second-order logic of trees [62]. Theorem proving 
is used in [33] to derive consequences of axioms about data 
structures. Many shape analyses perform abstract interpre- 
tation [19] to synthesize loop invariants [65, 29, 43]. 

Role logic. This paper presents role logic, a notation 
for describing properties of relational structures in shape 
analysis, databases, and knowledge bases. Role logic is an 
attempt to simultaneously achieve the simplicity of the role 
declarations of [43] with a transparent connection with the 
well-established first-order logic. 

On the one hand, the full role logic has the expres- 
sive power of first order logic with transitive closure, which 
makes it as expressive as the logic of [65, 36] and more ex- 
pressive than the original role constraints [43]. For exam- 
ple, role logic is closed under all propositional operations 
and generalizes boolean shape analysis constraints [48]. Role
logic formulas easily translate into the traditional first-order
logic notation.

On the other hand, like the specialized notation for 
declaring roles in [43], role logic allows natural description of
common properties of imperative data structures with mu- 
table references. Like dynamic logics [31] and description 
logics [1], role logic allows suppressing names of variables, 
which often leads to concise specifications. The conciseness 
of role logic makes it an appealing choice for lightweight 
annotations in a programming language. 

Another property that role logic shares with description 
logics is that an interesting subset of role logic is decidable.
We show the decidability of the fragment RL^2 of role
logic in Section 4 by establishing a correspondence with the
two-variable logic with counting C^2 [30, 57]. While many
description logics are known to be representable in C^2 but
are potentially weaker than C^2, the fragment RL^2 of role
logic matches precisely the expressive power of C^2.

Contributions. The following are the main contributions
of this paper: 

1. We introduce role logic, which applies the ideas of implicit
arguments and de Bruijn's lambda calculus notation
to first-order logic (Section 3). The result is
a concise way of specifying properties of first-order
structures that arise in shape analysis, databases, and
knowledge bases.

2. We define a variable-free subset RL^2 of role logic
(Section 4). We give a translation of RL^2 formulas to
formulas of two-variable logic with counting C^2. This
translation implies that RL^2 is decidable, because C^2 is
decidable [30]. We further give a translation of C^2
formulas to RL^2 formulas. These two translations imply
that RL^2 is just as expressive as C^2.

3. As the main application of role logic, in Section 5.1 
we present a compositional shape analysis technique. 
We introduce a unified language for writing implemen- 
tations, specifications, and conformance claims. The 
constructs of the language denote relations on program 
states expressible in the decidable fragment RL^2. The
analysis technique is based on generating verification
conditions in RL^2 and applying the decision procedure
for RL^2. The analysis verifies the correctness of the
dynamically changing referencing relationships between
objects by showing that procedures conform to their 
specifications. By conjoining procedure specifications 
with global invariants, the analysis can also show that 
the program preserves the key data structure consis- 
tency properties necessary for the correct execution of 
the program. 

4. We present two additional applications of role logic:

(a) we show in Section 5.3 that a subset of role logic
RL^2 naturally corresponds to an expressive
description logic [1, Chapter 5];

(b) we note in Section 5.2 that boolean shape analysis
constraints [48], which can describe the basic
structure of data-flow facts in [65], are a subset of
constraints expressible in role logic.

2 Example 

To give a flavor of role logic, we present an example that 
illustrates one aspect of a client-server manager system that 
assigns clients to servers. Figure 1 is a standard object 
model that graphically displays the system, using boxes to 
represent sets, arrows to represent relations, and intervals 
N..M to represent constraints on relations. Figure 2 de- 
scribes the same system using role logic. Figure 3 presents 
a fragment of the code of the system. The code is expressed 
in an imperative language extended with specification con- 
structs. 



Figure 1: An object model for a component of client-server
manager


$GlobalInvariant \equiv$
    $\{Servers\} \land (disjoint\ Servers, Clients) \land$
    $(partition\ Clients;\ WaitingClients, AssignedClients) \land$
    $[[server \Rightarrow AssignedClients' \land Servers]] \land$
    $[[clients \Leftrightarrow {\sim}server]] \land$
    $[AssignedClients \Rightarrow card^{=1} server] \land$
    $[Servers \Rightarrow card^{\leq 5} clients]$

Example consequence:

$P \equiv [WaitingClients \Rightarrow$
    $[\lnot(clients \lor server \lor {\sim}clients \lor {\sim}server)]]$

Figure 2: Global constraints of the client-server manager,
expressed in role logic



Global constraints. Figure 2 describes the global con- 
straints of a client-server manager system using a conjunc- 
tion of role logic formulas. There are two basic kinds of 
objects in the system: servers and clients. We model these 
objects using two disjoint sets Clients and Servers. The 
set Clients is further partitioned into the set AssignedClients 
of objects that have been assigned to servers, and the set 
WaitingClients that have not been assigned yet. The disjoint, 
partition, and other constructs of set algebra of sets and
relations ($\cap$, $\cup$, $\setminus$) are definable in role logic.

We require the set Servers to be non-empty, which
we denote by $\{Servers\}$, with the meaning $\exists x.\ Servers(x)$.
The constraint $[[server \Rightarrow AssignedClients' \land Servers]]$
translates to
$\forall x. \forall y.\ server(x,y) \Rightarrow AssignedClients(x) \land Servers(y)$.
Namely, each pair of brackets [ ] corresponds to a universal
quantifier. An occurrence of a binary predicate (such as server)
is implicitly supplied with the previous-innermost bound
variable (here, x) and the innermost bound variable (here,
y). An occurrence of a unary predicate such as Servers is
supplied with the innermost bound variable (y), unless the
unary predicate is primed, in which case the previous-innermost
bound variable (in this case x) is supplied instead.
The constraint $[[clients \Leftrightarrow {\sim}server]]$ means that the
relation clients is the inverse of the relation server. The
constraint $[Servers \Rightarrow card^{\leq 5} clients]$ translates into the formula
$\forall x.\ Servers(x) \Rightarrow \exists^{\leq 5} y.\ clients(x,y)$ in first-order logic with
counting quantifiers.
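
The first-order readings given above can be checked directly on a small finite structure. The following Python sketch is only an illustration: the function names, the sample objects, and the `bound` parameter (which stands in for the cardinality bound on clients per server) are assumptions introduced here, not part of the role logic notation.

```python
# Illustrative sketch: evaluate the first-order translations of some
# global constraints on a finite structure. All names are hypothetical.

def check_constraints(servers, assigned, waiting, server_rel, clients_rel, bound):
    """Return True if the structure satisfies the translated constraints."""
    domain = servers | assigned | waiting
    # {Servers}: exists x. Servers(x)
    nonempty = len(servers) > 0
    # [[server => AssignedClients' & Servers]]:
    # forall x, y. server(x, y) => AssignedClients(x) & Servers(y)
    typed = all(x in assigned and y in servers for (x, y) in server_rel)
    # [[clients <=> ~server]]: clients is the inverse of server
    inverse = clients_rel == {(y, x) for (x, y) in server_rel}
    # [Servers => card^{<= bound} clients]: each server has at most `bound` clients
    bounded = all(
        len({y for y in domain if (x, y) in clients_rel}) <= bound
        for x in servers
    )
    return nonempty and typed and inverse and bounded


def waiting_isolated(waiting, server_rel, clients_rel):
    """Property P: waiting clients have no incoming or outgoing edges."""
    edges = server_rel | clients_rel
    return all(x not in waiting and y not in waiting for (x, y) in edges)
```

For example, a structure with one server, one assigned client, and one waiting client satisfies both checks, while attaching a server edge to the waiting client violates the typing constraint.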

Note that all of our translations of constraints in Figure 2
use only two variables, x and y. In fact, our entire example
is expressed in the RL^2 fragment of role logic. In Section 4
we show that RL^2 corresponds to the decidable fragment
of two-variable first-order logic with counting, and is therefore
decidable. Figure 2 presents the formula P denoting the
fact that WaitingClients objects have no incoming or outgoing
edges. If we apply the decision procedure for RL^2, we
can show that $GlobalInvariant \Rightarrow P$ is a valid formula, which
means that P is a logical consequence of GlobalInvariant. By
querying whether GlobalInvariant implies properties of
interest such as P, the developers can increase their confidence
in the correctness and completeness of the design.
Moreover, our technique can be used to show the conformance
of the program with respect to the design.



proc assignClients() =
  spec old(GlobalInvariant) => !{WaitingClients} &
       [AssignedClients <=>
          old(AssignedClients | WaitingClients)] &
       GlobalInvariant

proc assignClientsIMPL() = {
  if ({WaitingClients}) {
    cl := getWaitingClient;
    assignOneClientIMPL(cl);
    assignClientsIMPL();
}}

claim: assignClientsIMPL => assignClients

proc assignOneClient(cl) =
  spec old(GlobalInvariant &
           [cl => WaitingClients]) =>
       [WaitingClients | cl <=> old(WaitingClients)] &
       [AssignedClients <=> old(AssignedClients) | cl] &
       GlobalInvariant

proc assignOneClientIMPL(cl) = {
  sv := getServer();
  if (Card(sv' & clients) <= 4) {
    WaitingClients := WaitingClients \ cl;
    AssignedClients := AssignedClients | cl;
    cl.server := sv;
    sv.clients := sv.clients | cl;
  } else {
    assignOneClientIMPL(cl);
}}

claim: assignOneClientIMPL => assignOneClient

proc getWaitingClient : set =
  spec {WaitingClients} =>
       skip & [returned => WaitingClients]

proc getServer() : set =
  spec {Servers} =>
       skip & [returned => Servers]



Figure 3: A fragment of a program that assigns
WaitingClients to Servers

Program fragment. Figure 3 shows a fragment of the
code of the client-server manager. The top-level procedure 
in the code is a tail-recursive procedure assignClientsIMPL 
that processes all WaitingClients objects and assigns them 
to Servers objects. The assignClientsIMPL procedure ter- 
minates if there are no WaitingClients objects. Otherwise, it 
uses the getWaitingClient procedure to obtain an element 
of WaitingClients and assigns it to some Servers object us- 
ing the assignOneClient procedure, and continues with the 
next WaitingClients object using a tail-recursive call. 

The partial correctness of the procedure 
assignClientsIMPL is given using the specification 
assignClients. The requirement that the procedure 
conforms to its specification is stated using the construct 

claim: assignClientsIMPL => assignClients 

The verification of each procedure call site uses only pro- 
cedure specification (summary) instead of the body of the 
procedure, which allows verification of recursive proce- 
dures. In this example, the implementations of procedures 
getWaitingClient and getServer are not available, which 
illustrates the advantage of assume/guarantee reasoning for 
partitioning a verification task. 

Using the translation in Section 5.1, the claim constructs 
are reduced to verification conditions expressed in role logic. 
For a large class of constructs presented in Section 5.1, and 
our example in particular, the resulting verification conditions
belong to the decidable RL^2 and can therefore be
discharged using a decision procedure for RL^2.

Note that we are able to express detailed specifications 
of the correctness of procedures while remaining in the de- 
cidable logic. For example, the specification assignClients 
ensures that the entire global invariant in Figure 2 is pre- 
served, and that no client objects are lost in the assignment 
process: after assignClients, the set AssignedClients is the 
union of the old value of AssignedClients and the old value 
of WaitingClients, whereas the new value of WaitingClients is 
an empty set. 

3 A Recipe for Role Logic 

In this section we motivate the role logic by constructing 
it in several steps. We start with first-order logic encoded 
in the simply typed lambda calculus; we then move to the 
notation that refers to each variable by its index. Finally, we
impose a rule for implicitly supplying the indices of variables
to predicate symbols. Later, in Section 3.6, we summarize 
the syntax and the semantics of role logic, and in Section 4 
we present a decidable sublogic of role logic. 

3.1 Lambda Calculus 

Figure 4 presents simply typed lambda calculus with explicit 
type annotations in lambda abstraction (the Church-style 
simply typed lambda calculus [5, Section 3.2]). This calculus 
is our starting point. 

As primitive types we use bool for boolean values, and
obj for objects. As the only type constructor we use arrow
$\to$. We introduce $rel^k$ as a shorthand type defined by

    $rel^0 = bool$
    $rel^{k+1} = obj \to rel^k$

Simple types enable us to give a simple set-theoretic semantics
to formulas by interpreting lambda abstractions as total
functions. The resulting semantics is in Figure 4; the semantics
is straightforward because we use lambda calculus itself
as our meta-notation.


Syntax

    Form ::= Vars                        variable lookup
          |  Form Form                   function application
          |  $\lambda$ Vars : Type. Form        function abstraction

    Vars = {x, f, ...}

Types

    $\Gamma(v) = T$
    ------------------
    $\Gamma \vdash v : T$

    $\Gamma \vdash F_1 : T_1 \to T_2, \quad \Gamma \vdash F_2 : T_1$
    ------------------------------------------
    $\Gamma \vdash F_1 F_2 : T_2$

    $\Gamma[v := T_1] \vdash F : T_2$
    ------------------------------------------
    $\Gamma \vdash (\lambda v : T_1.\ F) : T_1 \to T_2$

Semantics

    $[\![v]\!] e = e\ v$
    $[\![F_1 F_2]\!] e = ([\![F_1]\!] e)\ ([\![F_2]\!] e)$
    $[\![\lambda v : T.\ F]\!] e = \lambda d.\ [\![F]\!] (e[v := d])$

Figure 4: Church-style Simply Typed Lambda Calculus


3.2 De Bruijn Notation 

An alternative to referring to each bound variable by its 
name is to refer to each variable by its number, with number 
1 denoting the most recently bound variable. This is the 
idea behind de Bruijn indices for lambda calculus [22, 4]. 
Figure 5 presents the syntax and the semantics of lambda 
calculus notation with de Bruijn indices. The environment 
maps the keyword stack to a stack (i.e., a list) of elements 
of the domain. If h is an element and l a list, then the
notation h : l denotes the list with the head h and the tail
l. The abstraction pushes a value onto the stack; the index
$\langle k \rangle$ retrieves the k-th element from the top of the stack.
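
The stack discipline described above can be sketched as a small evaluator. This Python fragment is an illustrative assumption rather than part of the calculus: the tuple encoding of terms and the extra `const` case (used only to supply test values) are introduced here for the example.

```python
# Illustrative sketch of de Bruijn-style evaluation with an explicit stack.
# Terms (hypothetical encoding):
#   ("idx", k)      de Bruijn index <k>, with 1 = most recently bound
#   ("lam", body)   abstraction; pushes its argument onto the stack
#   ("app", f, a)   application
#   ("const", v)    a Python value, added here only for testing

def push(d, stack):
    """Return the stack with d as the new head."""
    return (d,) + stack


def nth(i, stack):
    """Retrieve the i-th element from the top of the stack (1-based)."""
    return stack[i - 1]


def evaluate(term, stack=()):
    tag = term[0]
    if tag == "idx":
        return nth(term[1], stack)
    if tag == "const":
        return term[1]
    if tag == "lam":
        body = term[1]
        return lambda d: evaluate(body, push(d, stack))
    if tag == "app":
        return evaluate(term[1], stack)(evaluate(term[2], stack))
    raise ValueError("unknown term: %r" % (tag,))
```

With this encoding, the term `("lam", ("lam", ("idx", 2)))` applied to two arguments returns the first one, because the second argument is bound most recently and therefore has index 1.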

3.3 Predicate Logic in Lambda Calculus 

We next encode first-order logic with equality in lambda 
calculus. We use EQ to denote the binary equality relation. 
We assume that the interpretation of relation symbols is 
specified in the environment e. We introduce conjunction 
and negation as logical operations acting on booleans (the
remaining propositional operations are defined in terms of
$\land$, $\lnot$, as usual). We use the abstraction in lambda calculus
to encode bound variables of predicate calculus. This is
the usual higher-order logic encoding of classical first-order
logic, as used, for example, in the Isabelle interactive theorem
prover [58]. Figure 6 presents this encoding of quantifiers.


Syntax

    Form ::= $\langle Nat \rangle$                 variable lookup
          |  Form Form              function application
          |  $\lambda$ : Type. Form         function abstraction

    Nat = {1, 2, ...}

Semantics

    $[\![\langle i \rangle]\!] e = get\ i\ e$
    $[\![\lambda : T.\ F]\!] e = \lambda d.\ [\![F]\!] (push\ d\ e)$

Auxiliary Functions

    $get\ i\ e = nth\ i\ (e\ stack)$
    $push\ d\ e = e[stack := d : (e\ stack)]$
    $nth\ 1\ (h : l) = h$
    $nth\ (i+1)\ (h : l) = nth\ i\ l$

Figure 5: De Bruijn Form of Simply Typed Lambda Calculus



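Quantifier Brackets

    $\{F\} \equiv \exists(\lambda : obj.\ F)$
    $[F] \equiv \lnot\{\lnot F\}$

Default Argument Rule

    When $\Gamma(r) = rel^k$ then write $r$
    instead of $r \langle k \rangle \langle k-1 \rangle \ldots \langle 1 \rangle$

Shorthands

    ${\sim}F \equiv (\lambda\lambda F) \langle 1 \rangle \langle 2 \rangle$
    $F' \equiv (\lambda\lambda F) \langle 2 \rangle \langle 2 \rangle$
    $card^{\geq k} F \equiv \{^k\ (\lambda F)\langle 1 \rangle \land \ldots \land (\lambda F)\langle k \rangle \land \bigwedge_{1 \leq i < j \leq k} \lnot EQ \langle i \rangle \langle j \rangle\ \}^k$
    $card^{=k} F \equiv card^{\geq k} F \land \lnot card^{\geq k+1} F$
    $(\sum_{i=1}^{n} Card\ F_i) \geq k \equiv \bigvee_{k_1 + \ldots + k_n = k} \bigwedge_{i=1}^{n} card^{\geq k_i} F_i$
    $(\sum_{i=1}^{n} Card\ F_i) = k \equiv \bigvee_{k_1 + \ldots + k_n = k} \bigwedge_{i=1}^{n} card^{=k_i} F_i$
    $disjoint\ F_1, \ldots, F_n \equiv [\bigwedge_{1 \leq i < j \leq n} \lnot(F_i \land F_j)]$
    $partition\ F;\ F_1, \ldots, F_n \equiv [F \Leftrightarrow \bigvee_{i=1}^{n} F_i] \land disjoint\ F_1, \ldots, F_n$
    $F_1 \setminus F_2 \equiv F_1 \land \lnot F_2$

Figure 7: de Bruijn form of Predicate Calculus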



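    $EQ :: rel^2$
    $[\![EQ]\!]\ x\ y = (x = y)$

    $\land :: bool \to bool \to bool$
    $[\![\land]\!]\ p\ q = p \land q$

    $\lnot :: bool \to bool$
    $[\![\lnot]\!]\ p = \lnot p$

    $\exists :: rel^1 \to bool$
    $[\![\exists]\!]\ f = \exists o \in [\![obj]\!].\ f\ o$

    $\exists v.\ F \equiv \exists(\lambda v : obj.\ F)$
    $\forall v.\ F \equiv \lnot \exists v.\ \lnot F$

Figure 6: First-Order Logic in Lambda Calculus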



To remain within first-order logic, we require the quantifier
$\exists$ to have monomorphic type $(obj \to bool) \to bool$ (see also
Section 3.7).

3.4 Implicit De Bruijn Indices 

Figure 7 shows how we combine the encoding of first-order
logic in higher-order logic and de Bruijn's notation for
lambda calculus.

Example 1 First-order predicate calculus formula

    $\forall x \forall y.\ f(x,y) \Rightarrow A(x) \land B(y)$

can be written in this notation as

    $[[f \langle 2 \rangle \langle 1 \rangle \Rightarrow A \langle 2 \rangle \land B \langle 1 \rangle]]$

The outermost [ ] bracket acts as the quantifier $\forall x$; the variable
x is referred to inside the formula as $\langle 2 \rangle$ because it is the
second innermost bound variable. The innermost [ ] bracket
acts as $\forall y$; the variable y is referred to as $\langle 1 \rangle$.



The interpretation environment e contains both the stack for
de Bruijn indices and the bindings of relation symbols such
as A and f in Example 1. Relation symbols of predicate logic
correspond to variables of type $rel^k$. We use the abstraction
over de Bruijn indices $\lambda : T.\ F$ only when T = obj, and write
this abstraction simply $\lambda F$. For every environment e, the
value (e stack) is a list of elements of type obj.

We next introduce the Default Argument Rule: we omit
de Bruijn indices from the expression $r \langle k \rangle \langle k-1 \rangle \ldots \langle 1 \rangle$ when
r is a relation symbol, that is, when $\Gamma(r) = rel^k$. We
interpret every occurrence of variable r when $\Gamma(r) = rel^k$ as
$r \langle k \rangle \langle k-1 \rangle \ldots \langle 1 \rangle$.

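Example 2 The Default Argument Rule means that instead of

    $[[f \langle 2 \rangle \langle 1 \rangle \Rightarrow A \langle 2 \rangle \land B \langle 1 \rangle]]$

we write

    $[[f \Rightarrow (\lambda A) \langle 2 \rangle \land B]]$

when $\Gamma(f) = rel^2$ and $\Gamma(A) = \Gamma(B) = rel^1$.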

♦ 

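We lose no expressive power by the Default Argument Rule.
For example, if we wish to denote $r \langle i_3 \rangle \langle i_2 \rangle \langle i_1 \rangle$, we write
$(\lambda\lambda\lambda r) \langle i_3 \rangle \langle i_2 \rangle \langle i_1 \rangle$. Note that the Default Argument Rule
applies only to the relation symbols, not to all subformulas,
so $(\lambda\lambda\lambda r)$ with the Default Argument Rule is equivalent to r
without the Default Argument Rule. In general, if r is a k-ary
relation, we write $((\lambda)^k r) \langle i_k \rangle \langle i_{k-1} \rangle \ldots \langle i_1 \rangle$ where we would
previously write $r \langle i_k \rangle \langle i_{k-1} \rangle \ldots \langle i_1 \rangle$.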

3.5 Shorthands 

Figure 7 introduces some shorthands. Tilde ~ swaps the two
topmost stack elements $\langle 1 \rangle$ and $\langle 2 \rangle$. Prime ' replaces the
top $\langle 1 \rangle$ with the element $\langle 2 \rangle$. An expression $card^{\geq k} F$, for an
integer k > 0, corresponds to a counting quantifier in first-order
logic [30]. A counting quantifier states that the number
of elements with some property is greater than or equal
to k. Figure 7 also introduces the shorthand $card^{=k} F$
and the shorthand Card for specifying a constraint on a sum
of cardinalities. The shorthands containing $\leq$ are defined
similarly.

These shorthands serve two purposes. On the one hand,
they allow expressing certain properties in a more concise
way. On the other hand, if we use the shorthands but give up
the ability to refer to indices explicitly, we obtain a fragment
of first-order logic that is equivalent to two-variable first-order
logic with counting (Section 4) and therefore decidable
[30].

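Example 3 Using the shorthands, we write the formula

    $\forall x \forall y.\ f(x,y) \Rightarrow A(x) \land B(y)$

as

    $[[f \Rightarrow A' \land B]]$

The convenience of role logic is even more evident in larger
formulas like

    $\forall x.\ A(x) \Rightarrow (\forall y.\ f(x,y) \Rightarrow B(y) \lor C(y)) \land (\forall z.\ g(x,z) \Rightarrow D(z))$

which can be written as

    $[A \Rightarrow [f \Rightarrow B \lor C] \land [g \Rightarrow D]]$    (1)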



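    $F^* \equiv rtrancl\ (\lambda\lambda F)\ \langle 2 \rangle\ \langle 1 \rangle$

    $[\![rtrancl]\!]\ r\ x\ y = \exists n \geq 0.\ \exists z_0, \ldots, z_n.\ z_0 = x \land z_n = y \land \bigwedge_{i=0}^{n-1} r\ z_i\ z_{i+1}$

    $F_1 \circ F_2 \equiv \{(\lambda\lambda F_1) \langle 3 \rangle \langle 1 \rangle \land (\lambda\lambda F_2) \langle 1 \rangle \langle 2 \rangle\}$

    $F^+ \equiv F \circ F^*$

    $acyclic\ F \equiv \lnot\{F^+ \land EQ\}$

    $tree\ F_1, \ldots, F_n \equiv acyclic\ (\bigvee_{i=1}^{n} F_i) \land [(\sum_{i=1}^{n} Card({\sim}F_i)) \leq 1]$

Figure 8: Transitive Closure Construct and Shorthands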

Formulas of form (1) are useful for describing properties of 
first order structures that arise in shape analysis, see e.g. 
[48, 47, 71]. 

♦ 

For additional expressive power we introduce the
reflexive-transitive closure operator *, with the semantics in
Figure 8. We also introduce a shorthand for relation composition.
The relation composition shorthand works when
$F_1$ and $F_2$ both denote binary relations, in which case the resulting
expression can be thought of as denoting a binary relation,
as well as when $F_1$ denotes a set and $F_2$ denotes a binary
relation, in which case the resulting expression denotes the set that
is the image of $F_1$ under $F_2$. For the case of relation image we also
introduce a simpler definition in Figure 13 whose advantage
is that it uses only two implicit indices.
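
Over a finite domain, the closure and composition constructs can be computed directly on sets of pairs. The Python sketch below is a minimal illustration under that finiteness assumption; the function names are hypothetical and the fixpoint iteration is one simple way to compute the closure, not a procedure taken from the paper.

```python
# Illustrative sketch: relations are finite sets of pairs.

def compose(r1, r2):
    """Relation composition: (x, y) such that r1(x, z) and r2(z, y) for some z."""
    return {(x, y) for (x, z1) in r1 for (z2, y) in r2 if z1 == z2}


def image(s, r):
    """Image of the set s under the relation r."""
    return {y for (z, y) in r if z in s}


def rtrancl(r, domain):
    """Reflexive-transitive closure of r over a finite domain."""
    closure = {(x, x) for x in domain} | set(r)
    while True:
        step = closure | compose(closure, closure)
        if step == closure:
            return closure
        closure = step


def acyclic(r, domain):
    """No element reaches itself through one or more r-steps."""
    plus = compose(set(r), rtrancl(r, domain))  # F+ = F o F*
    return all(x != y for (x, y) in plus)
```

For instance, a two-step chain over three elements is acyclic, and adding an edge from the last element back to the first makes `acyclic` return False.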

3.6 Role Logic 

Figure 9 summarizes the syntax of role logic. The semantics 
of role logic follows from Section 3. 

We next explain the purpose of lambda abstraction in 
our logic. 

3.7 Lambda Calculus for Predicate Definitions 

In the resulting role logic of Figure 9 we retain the named
variables in the environment, and we allow abstraction over
those named variables. As a result, there are two kinds of
lambda abstraction: abstraction over de Bruijn indices and
abstraction over named variables. Abstraction over a de
Bruijn index is always over $\langle 1 \rangle$, which denotes an object
of type obj; such abstraction is written $\lambda F$. The abstraction
over a named variable may abstract over variables of
more complex types and is written $\lambda x : T.\ F$. There is only
one kind of lambda calculus application; both $(\lambda F_1) F_2$ and
$(\lambda x : T.\ F_1) F_2$ are redexes.

The purpose of the named lambda abstraction $\lambda x : T.\ F$
is twofold. First, when T = obj, then we can write
$\exists(\lambda x : obj.\ F)$ as $\exists x.\ F$ as in the usual first-order predicate calculus.
Second, when T is not obj, we can encode acyclic definitions
of higher-order predicates that can be subsequently substituted
away. Define the expression

    let $P : T = F_1$ in $F_2$

to be equivalent to

    $(\lambda P : T.\ F_2) F_1$

Such definitions are very useful for describing complex data
structures.


    Form ::= Vars                     named object or predicate
          |  $\langle Nat \rangle$                 de Bruijn index of an object variable
          |  EQ                       equality between $\langle 1 \rangle$ and $\langle 2 \rangle$
          |  Form $\land$ Form             conjunction
          |  $\lnot$ Form                  negation
          |  $\exists$ Form                  existential quantification over objects
          |  $\lambda$ Form                  de Bruijn abstraction over objects
          |  $\lambda$ Vars : Type. Form     abstraction over named variables
          |  Form Form                function application
          |  Form'                    let $\langle 1 \rangle$ be $\langle 2 \rangle$ in F
          |  ~Form                    relation inverse
          |  $card^{\geq k}$ Form            at least k objects satisfy F
          |  Form*                    reflexive transitive closure

Figure 9: The Syntax of Role Logic

Note that acyclic definitions introduced through typed
lambda calculus via bindings $\lambda x : T.\ F$ for $T \neq bool$ do
not make the logic higher-order, because we define the
quantifier $\exists$ to always have the monomorphic type
$(obj \to bool) \to bool$, and the reflexive-transitive closure operator *
to have the type

    $(obj \to obj \to bool) \to (obj \to obj \to bool)$

Consider a well-typed formula F whose only free variables
are relation symbols, and whose de Bruijn indices only refer
to indices bound in the formula. Assume that we have
applied the Default Argument Rule, so that all de Bruijn indices
are explicit. Then we may treat de Bruijn abstraction
as the usual abstraction over a disjoint set of variables. By
strong normalization of simply typed lambda calculus [5],
let $F^\circ$ be the normal form of F. We claim that in $F^\circ$ the
only occurrence of lambda abstraction is within expressions
of the form $\exists(\lambda x : obj.\ F)$ or $rtrancl(\lambda x : obj.\ \lambda y : obj.\ F)$.

To show the claim, consider an occurrence of $\lambda x : obj.\ F_0$
in $F^\circ$. Let $F_1$ be the largest enclosing occurrence
$\lambda x_1 : T_1. \ldots \lambda x_n : T_n.\ \lambda x : obj.\ F_0$. Then $F_1$ cannot be the entire
$F^\circ$ because $F^\circ$ has type bool by subject reduction. $F_1$
cannot occur within some application $F_1 F_2$, because $F_1 F_2$
would constitute a redex and $F^\circ$ is in normal form. Hence,
$F_1$ can only occur in an expression of the form $F_3 F_1$. Let
us consider the "spine" [38] of $F_3 F_1$, so $F_3 = F_n F_{n-1} \ldots F_4$,
$n \geq 3$, and $F_n$ is not an application. $F_n$ is not an abstraction,
because $F^\circ$ is in normal form. Hence, $F_n$ can only be a
variable or a constant.

The only variables or constants that can, by the typing
rules, be applied to an abstraction $F_1$ are $\exists$ and rtrancl, so
either $F_n = \exists$ or $F_n = rtrancl$.

Consider the case $F_n = \exists$. By the type of $\exists$, we conclude
$F_3 = F_n$ and $F_1 = \lambda x : obj.\ F_0$, as desired.

Consider the case $F_n = rtrancl$. Then $F_3 = F_n$, and
$F_1 = \lambda u : obj.\ \lambda v : obj.\ G$, so either u = x and $F_1 = \lambda x : obj.\ F_0$
where $F_0 = \lambda v : obj.\ G$, or v = x and $F_1 = \lambda u : obj.\ \lambda x : obj.\ F_0$.
This finishes the proof of the claim.

We conclude that typed lambda calculus allows us to use
flexible definitions of higher-order predicates to structure
our specifications while keeping the language first-order,
because we may substitute away all definitions using strong
normalization of the typed lambda calculus.

4 Role Logic Subset RL^2 and its Decidability

In this section we introduce a subset RL^2 of role logic
(Figure 11) and show its decidability.

To show the decidability of RL^2, we give translations of
formulas between the following four logics:

1. D^2: the formulas of the first-order logic with counting
in which every subformula has at most two free
variables (different subformulas may have different free
variables);

2. C^2: the formulas of the two-variable logic with counting,
which uses x and y as the only variable names; the
satisfiability and finite satisfiability problems for C^2 were
shown to be decidable in [30]; the satisfiability problem
for C^2 was shown NEXPTIME-complete in [57];

3. I^2: de Bruijn version of the two-variable logic with
counting, which uses only de Bruijn indices $\langle 1 \rangle$ and
$\langle 2 \rangle$;

4. RL^2: a subset of role logic that contains no explicit de
Bruijn indices.

Figure 10 sketches the idea of the proof of equivalence of
these four logics. We give translations of formulas from D^2
to C^2 (Section 4.2, Figure 15), from C^2 to I^2 (Section 4.3,
Figure 18), from I^2 to RL^2 (Section 4.3, Figure 19) and
from RL^2 to D^2 (Section 4.4, Figure 20). These translations
imply that the satisfiability problems for these four logics are
equivalent, so by decidability of C^2 [30] we conclude that all
these logics are decidable.


    D^2 --(Fig. 15)--> C^2 --(Fig. 18)--> I^2 --(Fig. 19)--> RL^2 --(Fig. 20)--> D^2

Figure 10: Showing Equivalence of Four Logics.


quantifiers:

    $\{F\} \equiv card^{\geq 1} F$
    $[F] \equiv \lnot\{\lnot F\}$

relation image:

    $F_A \circ F_r \equiv \{F_A \land {\sim}F_r\}$

weakest precondition:

    $wp\ F_r\ F_A \equiv [F_r \Rightarrow F_A]$

Figure 13: Some Shorthands for RL^2




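    Form ::= Vars                 binary or unary relation symbol
          |  EQ                   equality between $\langle 1 \rangle$ and $\langle 2 \rangle$
          |  Form $\land$ Form         conjunction
          |  $\lnot$ Form              negation
          |  Form'                let $\langle 1 \rangle$ be $\langle 2 \rangle$ in F
          |  ~Form                relation inverse
          |  $card^{\geq k}$ Form        at least k objects satisfy F

Figure 11: The Syntax of RL^2 Subset of Role Logic


    $Nat_2 = \{1, 2\}$
    $e :: Nat_2 \to obj$

    $[\![A]\!] e = [\![A]\!] (e\,1)$
    $[\![f]\!] e = [\![f]\!] (e\,2,\ e\,1)$
    $[\![EQ]\!] e = ((e\,2) = (e\,1))$
    $[\![F_1 \land F_2]\!] e = ([\![F_1]\!] e) \land ([\![F_2]\!] e)$
    $[\![F']\!] e = [\![F]\!] (e[1 \mapsto (e\,2)])$
    $[\![{\sim}F]\!] e = [\![F]\!] (e[1 \mapsto (e\,2),\ 2 \mapsto (e\,1)])$
    $[\![card^{\geq k} F]\!] e = |\{o \mid [\![F]\!] (e[1 \mapsto o,\ 2 \mapsto (e\,1)])\}| \geq k$

Figure 12: The Semantics of RL^2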



4.1 The Role Logic Subset RL^2

Figure 11 presents the two-variable role logic RL^2.
Compared to the full role logic in Figure 9, RL^2 omits the
constructs for creating definitions, the constructs for explicitly
referring to object variables, and transitive closure.
Figure 12 summarizes the semantics of RL^2; this semantics is in
accordance with the semantics of the full role logic derived
in Section 3. Figure 13 defines shorthands that illustrate
some constructs definable in RL^2.

We show that RL^2 has precisely the same expressive
power as the set of the formulas of logic C^2, which is shown
decidable in [30] over the set of all models, as well as over
the set of finite models.
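
The two-element environment semantics of RL^2 can be illustrated with a small evaluator over finite models. The Python sketch below is an assumption-laden illustration: the tuple encoding of formulas, the `model` dictionary layout, and the helper names `some`, `every`, and `implies` are all introduced here for the example.

```python
# Illustrative sketch of evaluating RL^2 formulas over a finite model.
# Environment e = (e1, e2): e1 is index <1> (innermost), e2 is index <2>.
# Formulas (hypothetical encoding):
#   ("unary", A) | ("binary", f) | ("eq",) | ("and", F, G) | ("not", F)
#   ("prime", F) | ("inv", F) | ("card", k, F)

def ev(formula, e, model):
    e1, e2 = e
    tag = formula[0]
    if tag == "unary":
        return e1 in model["unary"][formula[1]]
    if tag == "binary":
        return (e2, e1) in model["binary"][formula[1]]
    if tag == "eq":
        return e2 == e1
    if tag == "and":
        return ev(formula[1], e, model) and ev(formula[2], e, model)
    if tag == "not":
        return not ev(formula[1], e, model)
    if tag == "prime":   # let <1> be <2>
        return ev(formula[1], (e2, e2), model)
    if tag == "inv":     # swap <1> and <2>
        return ev(formula[1], (e2, e1), model)
    if tag == "card":    # at least k objects o with F under <1> := o, <2> := old <1>
        k, body = formula[1], formula[2]
        count = sum(1 for o in model["domain"] if ev(body, (o, e1), model))
        return count >= k
    raise ValueError("unknown formula: %r" % (tag,))


def some(f):             # {F} as card^{>=1} F
    return ("card", 1, f)


def every(f):            # [F] as not {not F}
    return ("not", some(("not", f)))


def implies(a, b):
    return ("not", ("and", a, ("not", b)))
```

As a usage example, the formula `every(every(implies(("binary", "server"), ("and", ("prime", ("unary", "AssignedClients")), ("unary", "Servers")))))` encodes the constraint that every server edge goes from an assigned client to a server; since it is closed, the initial environment passed to `ev` does not affect the result.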

4.2 Two-Variable Logics C^2 and D^2

Figure 14 presents the logic C^2 [30]. The logic C^2 is first-order
logic with equality and counting, restricted to formulas
that contain only two fixed variable names x and y.

In this section we argue that a more flexible restriction
on variable names yields a logic with the same definable relations.
Let FV(F) denote the free variables of formula F.

Definition 4 A D^2 formula is a formula F of first-order
logic with counting such that $|FV(G)| \leq 2$ for every subformula
G of F.

Clearly every C^2 formula is a D^2 formula, but not vice
versa, because the set of possible variables that may occur in
D^2 formulas is countably infinite. The syntactic restriction
on variables in Definition 4 is more general than in the
definition of C^2, which makes D^2 more convenient for writing
readable formulas.

We show that every D^2 formula is equivalent to a C^2
formula (modulo the renaming of free variables). Up to one
technical detail, it suffices to rename bound variables in a
D^2 formula to obtain a C^2 formula. We therefore derive the
equivalence of D^2 and C^2 as a consequence of an observation
about lambda calculus terms.

Definition 5 Define the set of lambda calculus terms
2VarTerms as the smallest set that satisfies the following
conditions:

1. $v \in$ 2VarTerms if v is a variable and $c \in$ 2VarTerms if c
is a constant;

2. if $T_1, T_2 \in$ 2VarTerms and $|FV(T_1) \cup FV(T_2)| \leq 2$, then
$(T_1 T_2) \in$ 2VarTerms;

3. if $T \in$ 2VarTerms, v is a variable, and $|FV(T) \cup \{v\}| \leq 2$,
then $\lambda v.T \in$ 2VarTerms.


    $Vars_2 = \{x, y\}$

    Form ::= $A(Vars_2)$                  atomic formula with unary relation A
          |  $f(Vars_2, Vars_2)$          atomic formula with binary relation f
          |  $Vars_2 = Vars_2$            equality between objects
          |  Form $\land$ Form              conjunction
          |  $\lnot$ Form                   negation
          |  $\exists^{\geq k} Vars_2.$ Form        at least k objects satisfy formula

Figure 14: The Syntax of Two-Variable Logic with Counting C^2


From Definition 5 it follows that if $T \in$ 2VarTerms, then
$|FV(T_1)| \leq 2$ for every subterm $T_1$ of T. Moreover, if $\lambda v.T \in$
2VarTerms and $v \notin FV(T)$, then $|FV(T)| \leq 1$.

We next define the set capt(v, F) of those bound variables
z in formula F such that v occurs in the scope of a binding
of z.

Definition 6

    $capt(v, u) = \emptyset$, if u is a variable
    $capt(v, F_1 F_2) = capt(v, F_1) \cup capt(v, F_2)$
    $capt(v, \lambda u.F) = capt(v, F) \cup \{u\}$, if $v \in FV(\lambda u.F)$
    $capt(v, \lambda u.F) = \emptyset$, otherwise

As usual, we say that T and T' are $\alpha$-equivalent if T' can
be obtained from T by renaming bound variables.
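
The capt function can be computed by a direct structural recursion. The following Python sketch is an illustration only; the tuple encoding of lambda terms and the function names are assumptions made for this example.

```python
# Illustrative sketch of free variables and capt on untyped lambda terms.
# Terms (hypothetical encoding):
#   ("var", name) | ("app", t1, t2) | ("lam", name, body)

def free_vars(t):
    """Set of free variable names of the term t."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return free_vars(t[2]) - {t[1]}


def capt(v, t):
    """Bound variables whose binding scope contains a free occurrence of v."""
    if t[0] == "var":
        return set()
    if t[0] == "app":
        return capt(v, t[1]) | capt(v, t[2])
    u, body = t[1], t[2]
    if v in free_vars(t):
        return capt(v, body) | {u}
    return set()
```

For the term $\lambda x.\ (u\ (\lambda y.\ v))$, the sketch gives capt(u) = {x} and capt(v) = {x, y}, since v occurs under both binders while u occurs only under the outer one.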

Lemma 7 For every $T \in$ 2VarTerms with $FV(T) \subseteq \{u, v\}$
there exists a term T' = norm(T) such that T' is $\alpha$-equivalent
to T, all bound variables in T' are among {x, y}, and either

1. $capt(u, T') \subseteq \{x\}$ and $capt(v, T') \subseteq \{y\}$, or

2. $capt(u, T') \subseteq \{y\}$ and $capt(v, T') \subseteq \{x\}$.

Proof. Let $FV(T) \subseteq \{u, v\}$. Without loss of generality
we may assume that $\{u, v\} \cap \{x, y\} = \emptyset$. The proof is by
induction on the structure of terms.

1. T = w for a variable w. Let T' = T; clearly
$capt(u, T') = capt(v, T') = \emptyset$.

2. $T = T_1 T_2$. Let $T_1' = norm(T_1)$ and $T_2' = norm(T_2)$ by
induction hypothesis. Assume $capt(u, T_1') \subseteq \{x\}$ and
$capt(v, T_1') \subseteq \{y\}$ (the other case is symmetric). We
consider two cases for $T_2'$.

(a) $capt(u, T_2') \subseteq \{x\}$ and $capt(v, T_2') \subseteq \{y\}$. Then
let $norm(T) = T_1' T_2'$.

(b) $capt(u, T_2') \subseteq \{y\}$ and $capt(v, T_2') \subseteq \{x\}$. Let $T_2''$
be the result of swapping in $T_2'$ all occurrences of
bound variables x and y. Then $capt(u, T_2'') \subseteq \{x\}$
and $capt(v, T_2'') \subseteq \{y\}$, so we let $norm(T) = T_1' T_2''$.

In both cases, $capt(u, norm(T)) \subseteq \{x\}$ and
$capt(v, norm(T)) \subseteq \{y\}$.

3. $T = \lambda w.T_1$. $|\{u, v\}| = 2$ and $|FV(T_1) \cup \{w\}| \leq 2$
by the definition of 2VarTerms, so it cannot be the
case that both $u \in FV(T_1)$ and $v \in FV(T_1)$. Since
$FV(T_1) \subseteq \{u, v, w\}$, we conclude that $FV(T_1) \subseteq \{u, w\}$
or $FV(T_1) \subseteq \{v, w\}$.

Suppose therefore that $FV(T_1) \subseteq \{u, w\}$ (the case
$FV(T_1) \subseteq \{v, w\}$ is symmetric). By induction hypothesis,
let $T_1' = norm(T_1)$. Assume $capt(u, T_1') \subseteq \{x\}$
and $capt(w, T_1') \subseteq \{y\}$ (the case $capt(u, T_1') \subseteq \{y\}$
and $capt(w, T_1') \subseteq \{x\}$ is symmetric). Let norm(T) =
$\lambda x.(T_1'[w := x])$. Then $capt(u, norm(T)) \subseteq \{x\}$ and
$capt(v, norm(T)) = \emptyset \subseteq \{y\}$.



To apply Lemma 7 to formulas, we represent all logical operations and
quantifiers as constants. Variables in a lambda term then correspond to
first-order variables. To ensure that the representation of formulas
satisfies the condition $|\mathrm{FV}(T) \cup \{v\}| \le 2$ for each term
$\lambda v.T$, we require the following condition:

\[
\text{For every formula } \exists^{\ge k}x.\,F, \text{ either } x \in \mathrm{FV}(F) \text{ or } F \equiv \mathrm{true}. \tag{2}
\]

We ensure this condition by applying the rule

\[
\exists^{\ge k}x.\,F \;\leadsto\; F \land \exists^{\ge k}x.\,\mathrm{true}
\]

for $x \notin \mathrm{FV}(F)$.

After ensuring condition (2), we apply the translation in Figure 15.
Lemma 7 justifies the correctness of the translation. The translated
formula is of the same size as the original formula. The translation can
clearly be performed in polynomial time, including the process of
ensuring condition (2). The translation time can be made close to linear
by delaying the application of the substitution $[w := x]$ and the swap
operation.

4.3 From $C^2$ to $\mathrm{RL}^2$ via $I^2$

In this section we introduce the logic $I^2$ (Figure 16). We then
give translations from $C^2$ to $I^2$ (Figure 18), and from $I^2$ to
$\mathrm{RL}^2$ (Figure 19).
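\[
\begin{array}{rcl}
T_{DC}\llbracket A(v) \rrbracket &=& A(v) \\[2pt]
T_{DC}\llbracket f(u,v) \rrbracket &=& f(u,v) \\[2pt]
T_{DC}\llbracket F_1 \land F_2 \rrbracket &=&
  \begin{cases}
    F_1' \land F_2', &
      \text{if } \mathrm{capt}(u,F_1'), \mathrm{capt}(u,F_2') \subseteq \{x\} \text{ and }
                 \mathrm{capt}(v,F_1'), \mathrm{capt}(v,F_2') \subseteq \{y\}, \\
    & \text{or } \mathrm{capt}(u,F_1'), \mathrm{capt}(u,F_2') \subseteq \{y\} \text{ and }
                 \mathrm{capt}(v,F_1'), \mathrm{capt}(v,F_2') \subseteq \{x\} \\
    F_1' \land (\mathrm{swap}\ F_2'), & \text{otherwise}
  \end{cases} \\
&& \text{where } \mathrm{FV}(F_1 \land F_2) = \{u, v\},\quad
   F_1' = T_{DC}\llbracket F_1 \rrbracket,\quad F_2' = T_{DC}\llbracket F_2 \rrbracket \\[6pt]
T_{DC}\llbracket \exists^{\ge k}w.\,F \rrbracket &=&
  \begin{cases}
    \exists^{\ge k}x.\,(F'[w := x]), & \text{if } \mathrm{capt}(u,F') \subseteq \{x\},\ \mathrm{capt}(w,F') \subseteq \{y\} \\
    \exists^{\ge k}y.\,(F'[w := y]), & \text{if } \mathrm{capt}(u,F') \subseteq \{y\},\ \mathrm{capt}(w,F') \subseteq \{x\}
  \end{cases} \\
&& \text{where } \mathrm{FV}(F) \subseteq \{u, w\},\quad F' = T_{DC}\llbracket F \rrbracket \\[6pt]
\mathrm{swap}(A(u)) &=& A(s\,u) \\
\mathrm{swap}(f(u,v)) &=& f(s\,u, s\,v) \\
\mathrm{swap}(\lnot F) &=& \lnot(\mathrm{swap}\ F) \\
\mathrm{swap}(F_1 \land F_2) &=& \mathrm{swap}\ F_1 \land \mathrm{swap}\ F_2 \\
\mathrm{swap}(\exists^{\ge k}v.\,F) &=& \exists^{\ge k}(s\,v).\,(\mathrm{swap}\ F) \\[2pt]
&& s\,x = y,\quad s\,y = x,\quad s\,u = u \text{ if } u \notin \{x, y\}
\end{array}
\]

Figure 15: Translation of $D^2$ formulas to $C^2$ formulas.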



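Form ::= $A(\langle \mathrm{Nat2} \rangle)$                                    atomic formula with unary relation $A$
      |  $f(\langle \mathrm{Nat2} \rangle, \langle \mathrm{Nat2} \rangle)$     atomic formula with binary relation $f$
      |  $\langle \mathrm{Nat2} \rangle = \langle \mathrm{Nat2} \rangle$       equality between objects
      |  Form $\land$ Form                                                      conjunction
      |  $\lnot$Form                                                            negation
      |  $\mathrm{card}^{\ge k}$Form                                            at least $k$ objects satisfy formula

Figure 16: The Syntax of Intermediate Logic $I^2$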
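\[
\begin{array}{rcl}
\multicolumn{3}{l}{e \in \mathrm{Nat2} \to \mathrm{Vars2}} \\[4pt]
T_{IC}\llbracket A(\langle i \rangle) \rrbracket e &=& A(e\,i) \\
T_{IC}\llbracket f(\langle i_1 \rangle, \langle i_2 \rangle) \rrbracket e &=& f(e\,i_1, e\,i_2) \\
T_{IC}\llbracket \langle i_1 \rangle = \langle i_2 \rangle \rrbracket e &=& (e\,i_1) = (e\,i_2) \\
T_{IC}\llbracket F_1 \land F_2 \rrbracket e &=& (T_{IC}\llbracket F_1 \rrbracket e) \land (T_{IC}\llbracket F_2 \rrbracket e) \\
T_{IC}\llbracket \lnot F \rrbracket e &=& \lnot(T_{IC}\llbracket F \rrbracket e) \\
T_{IC}\llbracket \mathrm{card}^{\ge k} F \rrbracket e &=& \exists^{\ge k}v.\,(T_{IC}\llbracket F \rrbracket [1 \mapsto v,\ 2 \mapsto (e\,1)]) \\
&& v = s(e\,1),\quad s\,x = y,\quad s\,y = x
\end{array}
\]

correctness criterion:

\[
\llbracket T_{IC}\llbracket F \rrbracket e \rrbracket e_C = \llbracket F \rrbracket (e_C \circ e)
\]

Figure 17: Translating $I^2$ formulas to $C^2$ formulas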

Intermediate logic. Figure 16 presents the logic $I^2$. $I^2$ is
a version of $C^2$ that uses two de Bruijn indices instead of
variables. We introduce $I^2$ to separate the translation of
$C^2$ formulas to $\mathrm{RL}^2$ into two phases: the first phase introduces
de Bruijn indices, and the second phase introduces the Default
Argument Rule.

For the sake of illustration, we first present a converse
translation, from $I^2$ to $C^2$, although we do not need this
translation to show the equivalence of $D^2$, $C^2$, $I^2$, and $\mathrm{RL}^2$.

From $I^2$ to $C^2$. Figure 17 presents the translation of $I^2$
into $C^2$. This translation amounts to introducing the variables
$x$ and $y$ alternately for successive counting quantifiers, and
resolving the indices appropriately. Using the criterion in
Figure 17, the correctness of the translation follows by
induction on the structure of formulas.
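The translation from de Bruijn indices back to the two variables can be sketched as a recursive function. This is only an illustration under stated assumptions: formulas are Python tuples, the output is a plain string, and the names `t_ic` and `other_var` are hypothetical.

```python
# Index-based formulas (illustrative encoding, not the paper's notation):
#   ("A", name, i) | ("f", name, i1, i2) | ("eq", i1, i2)
#   ("and", F1, F2) | ("not", F1) | ("card", k, F1)
# The environment e maps the indices 1 and 2 to the variables "x" and "y".

def other_var(v):
    """The function s of Figure 17: exchange x and y."""
    return "y" if v == "x" else "x"

def t_ic(F, e):
    tag = F[0]
    if tag == "A":
        return f"{F[1]}({e[F[2]]})"
    if tag == "f":
        return f"{F[1]}({e[F[2]]},{e[F[3]]})"
    if tag == "eq":
        return f"{e[F[1]]}={e[F[2]]}"
    if tag == "and":
        return f"({t_ic(F[1], e)} & {t_ic(F[2], e)})"
    if tag == "not":
        return f"!{t_ic(F[1], e)}"
    if tag == "card":
        k, body = F[1], F[2]
        v = other_var(e[1])          # fresh variable alternates with e 1
        inner = {1: v, 2: e[1]}      # old index 1 becomes index 2
        return f"E>={k} {v}. {t_ic(body, inner)}"
    raise ValueError(f"unknown formula tag: {tag!r}")

# card>=2 ( f(<2>,<1>) & card>=1 g(<2>,<1>) ), starting with 1 -> x, 2 -> y
formula = ("card", 2, ("and", ("f", "f", 2, 1),
                       ("card", 1, ("f", "g", 2, 1))))
result = t_ic(formula, {1: "x", 2: "y"})
```

In the example, the outer quantifier binds `y` and the nested one binds `x`, so the bound variables alternate along the path.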

From $C^2$ to $I^2$. We turn to the translation from $C^2$ to
$I^2$. Consider the $C^2$ formula

\[
F = \exists^{\ge k}y.\,(\exists^{\ge k}x.\,(\exists^{\ge k}x.\,P(x,y)) \land Q(x,y))
\]

The subformula $P(x, y)$ of $F$ refers to the variable $y$, which
is the 3rd bound variable starting from the innermost one.
Therefore, the straightforward replacement of variables by
de Bruijn indices would require access to $\langle 3 \rangle$. To
address this problem, the translation from $C^2$ to $I^2$ involves
a preparatory "alternating transformation" on formulas.
For every formula $F$, let $B(F)$ denote some purely propositional
combination of $F$ and perhaps some other formulas.
The alternating transformation eliminates all subformulas
of the form $\exists^{\ge k}v.\,B(\exists^{\ge k}v.\,G(v))$ for $v \in \mathrm{Vars2}$.
In the resulting formula, the sequence of bound variables along any
path in the formula tree is alternating, that is, it satisfies the
regular expression $(y \mid \varepsilon)(xy)^*(x \mid \varepsilon)$.

For the purpose of the alternating transformation, we add
the disjunction $\lor$ to the language. We show how to eliminate
successive quantification over $x$ from $\exists^{\ge k}x.\,B(\exists^{\ge k}x.\,G)$
(the case of $\exists^{\ge k}y.\,B(\exists^{\ge k}y.\,G)$ is analogous). First,
transform $B$ into a disjunction of canonical conjunctions of
formulas $H$, where each $H$ satisfies one of the following three
conditions:



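\[
\begin{array}{rcl}
\multicolumn{3}{l}{e \in \mathrm{Vars2} \to \mathrm{Nat2}} \\[4pt]
T_{CI}\llbracket A(v) \rrbracket e &=& A(\langle e\,v \rangle) \\
T_{CI}\llbracket f(v_1, v_2) \rrbracket e &=& f(\langle e\,v_1 \rangle, \langle e\,v_2 \rangle) \\
T_{CI}\llbracket v_1 = v_2 \rrbracket e &=& \langle e\,v_1 \rangle = \langle e\,v_2 \rangle \\
T_{CI}\llbracket F_1 \land F_2 \rrbracket e &=& (T_{CI}\llbracket F_1 \rrbracket e) \land (T_{CI}\llbracket F_2 \rrbracket e) \\
T_{CI}\llbracket \lnot F \rrbracket e &=& \lnot(T_{CI}\llbracket F \rrbracket e) \\
T_{CI}\llbracket \exists^{\ge k}x.\,F \rrbracket e &=& \mathrm{card}^{\ge k}(T_{CI}\llbracket F \rrbracket [x \mapsto 1,\ y \mapsto 2]) \qquad \text{invariant: } e\,y = 1 \\
T_{CI}\llbracket \exists^{\ge k}y.\,F \rrbracket e &=& \mathrm{card}^{\ge k}(T_{CI}\llbracket F \rrbracket [y \mapsto 1,\ x \mapsto 2]) \qquad \text{invariant: } e\,x = 1
\end{array}
\]

correctness criterion:

\[
\llbracket T_{CI}\llbracket F \rrbracket e \rrbracket e_I = \llbracket F \rrbracket (e_I \circ e)
\]

Figure 18: Translating normalized $C^2$ formulas to $I^2$ formulas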

C1) $H$ is quantifier-free;

C2) $H$ is of the form $\exists^{\ge k}v.\,G(v)$ for $v \in \mathrm{Vars2}$;

C3) $H$ is of the form $\lnot\exists^{\ge k}v.\,G(v)$ for $v \in \mathrm{Vars2}$.

Let $B = \bigvee_{i=1}^{n} B_i$ where each $B_i$ is a canonical conjunction
(cube) of formulas satisfying conditions C1), C2), C3). Because
$B_i \land B_j$ is contradictory for distinct cubes $B_i$ and $B_j$,
the sets of objects $o$ satisfying different $B_i$ are disjoint, so

\[
|\{o \mid \llbracket B \rrbracket e[x \mapsto o]\}| = \sum_{i=1}^{n} |\{o \mid \llbracket B_i \rrbracket e[x \mapsto o]\}|
\]

We can therefore replace the counting quantifier on $B$ with a
propositional combination of counting quantifiers on $B_i$ for
$1 \le i \le n$ (as in quantifier elimination for boolean algebras,
[67], [49, Section 3.2]). Specifically,

\[
\exists^{\ge k}x.\,B \iff \bigvee_{k_1 + \cdots + k_n = k}\ \bigwedge_{i=1}^{n} \exists^{\ge k_i}x.\,B_i \tag{3}
\]

It is therefore sufficient to eliminate the successive quantification
over $x$ in $\exists^{\ge k}x.\,B_i(\exists^{\ge k}x.\,G)$. Group the conjuncts
in $B_i$ as follows. Let $\mathrm{FV}(F)$ denote the free variables of formula
$F$. Let $P(x)$ be the conjunction of the conjuncts $C$ of $B_i$ such
that $x \in \mathrm{FV}(C)$, and let $Q$ be the conjunction of all
conjuncts $C$ of $B_i$ such that $x \notin \mathrm{FV}(C)$. All occurrences of
$\exists^{\ge k}x.\,G$ in $B_i$ are in $Q$. We have

\[
\exists^{\ge k}x.\,B_i \iff \exists^{\ge k}x.\,(Q \land P(x)) \iff Q \land \exists^{\ge k}x.\,P(x)
\]

where the last equivalence follows easily from the definition of
the counting quantifier. In the resulting formula
$Q \land \exists^{\ge k}x.\,P(x)$, the subformula $\exists^{\ge k}x.\,G$ is in $Q$ and is
therefore not in the scope of the original quantifier. By
repeating this transformation we ensure that all quantifiers
are alternating.
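The counting argument behind splitting a quantifier over mutually exclusive cubes can be checked by brute force on a small universe. The sketch below is only a sanity check of that arithmetic fact; the predicates, the universe, and the helper names are assumptions made for the illustration.

```python
from itertools import product

def count_at_least(k, objects, pred):
    """Counting quantifier: at least k objects satisfy pred."""
    return sum(1 for o in objects if pred(o)) >= k

def split_count(k, objects, cubes):
    """Some k1 + ... + kn = k with at least ki objects satisfying cube i."""
    n = len(cubes)
    return any(
        all(count_at_least(ki, objects, c) for ki, c in zip(ks, cubes))
        for ks in product(range(k + 1), repeat=n)
        if sum(ks) == k
    )

objects = range(10)
# Two mutually exclusive "cubes" (hypothetical predicates)
cubes = [lambda o: o % 3 == 0, lambda o: o % 3 == 1]

def union(o):
    """Disjunction of the cubes."""
    return any(c(o) for c in cubes)

# The direct count and the split form agree for every threshold k
agree = all(
    count_at_least(k, objects, union) == split_count(k, objects, cubes)
    for k in range(12)
)
```

The check relies on the cubes being pairwise disjoint; with overlapping predicates the split form could count an object twice.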
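\[
\begin{array}{rcl}
T_{IR}\llbracket A(\langle 1 \rangle) \rrbracket &=& A \\
T_{IR}\llbracket A(\langle 2 \rangle) \rrbracket &=& A' \\
T_{IR}\llbracket f(\langle 2 \rangle, \langle 1 \rangle) \rrbracket &=& f \\
T_{IR}\llbracket f(\langle 1 \rangle, \langle 2 \rangle) \rrbracket &=& {\sim}f \\
T_{IR}\llbracket f(\langle 2 \rangle, \langle 2 \rangle) \rrbracket &=& f' \\
T_{IR}\llbracket f(\langle 1 \rangle, \langle 1 \rangle) \rrbracket &=& {\sim}(f') \\
T_{IR}\llbracket \langle 2 \rangle = \langle 1 \rangle \rrbracket &=& \mathrm{EQ} \\
T_{IR}\llbracket \langle 1 \rangle = \langle 2 \rangle \rrbracket &=& \mathrm{EQ} \\
T_{IR}\llbracket \langle 1 \rangle = \langle 1 \rangle \rrbracket &=& \mathrm{true} \\
T_{IR}\llbracket \langle 2 \rangle = \langle 2 \rangle \rrbracket &=& \mathrm{true} \\
T_{IR}\llbracket F_1 \land F_2 \rrbracket &=& T_{IR}\llbracket F_1 \rrbracket \land T_{IR}\llbracket F_2 \rrbracket \\
T_{IR}\llbracket \lnot F \rrbracket &=& \lnot T_{IR}\llbracket F \rrbracket \\
T_{IR}\llbracket \mathrm{card}^{\ge k} F \rrbracket &=& \mathrm{card}^{\ge k}\, T_{IR}\llbracket F \rrbracket
\end{array}
\]

correctness criterion:

\[
\llbracket T_{IR}\llbracket F \rrbracket \rrbracket e_I = \llbracket F \rrbracket e_I
\]

Figure 19: Translating $I^2$ formulas to $\mathrm{RL}^2$ formulas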

After the alternating transformation, the translation
from $C^2$ to $I^2$ is straightforward, and is presented in
Figure 18. The correctness of the translation follows by
induction on the structure of formulas. The translation in
Figure 18 runs in linear time and produces an $I^2$ formula
whose size is linear in the size of the original formula.
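A minimal sketch of this variable-to-index translation follows. It assumes formulas are encoded as Python tuples and it rejects inputs whose quantifiers do not alternate, mirroring the invariants stated in Figure 18; the function name `t_ci` and the encoding are illustrative assumptions.

```python
def t_ci(F, e):
    """Replace the variables x, y of an alternating formula by indices 1, 2.

    Illustrative encoding: ("A", name, v), ("f", name, v1, v2),
    ("eq", v1, v2), ("and", F1, F2), ("not", F1), ("exists", k, v, F1).
    The environment e maps "x" and "y" to the indices 1 and 2.
    """
    tag = F[0]
    if tag == "A":
        return ("A", F[1], e[F[2]])
    if tag == "f":
        return ("f", F[1], e[F[2]], e[F[3]])
    if tag == "eq":
        return ("eq", e[F[1]], e[F[2]])
    if tag == "and":
        return ("and", t_ci(F[1], e), t_ci(F[2], e))
    if tag == "not":
        return ("not", t_ci(F[1], e))
    if tag == "exists":
        k, v, body = F[1], F[2], F[3]
        other = "y" if v == "x" else "x"
        # Invariant of Figure 18: the other variable is currently index 1.
        if e[other] != 1:
            raise ValueError("quantifiers do not alternate")
        return ("card", k, t_ci(body, {v: 1, other: 2}))
    raise ValueError(f"unknown formula tag: {tag!r}")

# E>=1 y. E>=2 x. P(x, y), with x initially at index 1
good = ("exists", 1, "y", ("exists", 2, "x", ("f", "P", "x", "y")))
translated = t_ci(good, {"x": 1, "y": 2})
```

A formula that quantifies over `x` twice in a row violates the invariant and raises `ValueError`, which is the situation the alternating transformation is meant to rule out.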

The alternating transformation that precedes the translation
may cause an exponential blowup of the formula size due
to the translation to disjunctive normal form, but for most
formulas the transformation need not be applied. Moreover,
if we allow introducing new predicate names, then we may
replace $\exists^{\ge k}x.\,B(\exists^{\ge k}x.\,G(x,y))$ with $\exists^{\ge k}x.\,B(P(y))$ and
conjoin the topmost formula with the formula
$\forall y.\,P(y) \iff \exists^{\ge k}x.\,G(x,y)$. Such a transformation can be
performed in linear time and preserves the satisfiability of formulas (see
[30, Section 2.1, Page 18] and [30, Lemma 2.3]).

From $I^2$ to $\mathrm{RL}^2$. Figure 19 presents the translation from
$I^2$ to $\mathrm{RL}^2$, which is simple and does not require a translation
environment. The translation algorithm runs in linear time
and produces an $\mathrm{RL}^2$ formula whose size is linear in the size
of the original $I^2$ formula.
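Because no environment is needed, the translation of Figure 19 is essentially a table lookup on atomic formulas. The sketch below assumes a tuple encoding and a string output, writes `~` for the inverse and `'` for the prime operator, and assumes that conjunction, negation, and counting are translated homomorphically; all names are illustrative.

```python
def t_ir(F):
    """Translate an index-based formula into variable-free role logic text."""
    tag = F[0]
    if tag == "A":                       # ("A", name, index)
        return F[1] if F[2] == 1 else F[1] + "'"
    if tag == "f":                       # ("f", name, i1, i2)
        name = F[1]
        table = {
            (2, 1): name,                # default argument order
            (1, 2): "~" + name,          # inverse
            (2, 2): name + "'",          # both arguments are index 2
            (1, 1): "~(" + name + "')",  # both arguments are index 1
        }
        return table[(F[2], F[3])]
    if tag == "eq":                      # ("eq", i1, i2)
        return "true" if F[1] == F[2] else "EQ"
    if tag == "and":
        return "(" + t_ir(F[1]) + " & " + t_ir(F[2]) + ")"
    if tag == "not":
        return "!" + t_ir(F[1])
    if tag == "card":                    # ("card", k, F1)
        return "card>=" + str(F[1]) + "(" + t_ir(F[2]) + ")"
    raise ValueError(f"unknown formula tag: {tag!r}")

example = t_ir(("card", 2, ("and", ("f", "f", 1, 2), ("A", "A", 2))))
```

Each input node produces a constant amount of output, which matches the linear size bound stated above.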

4.4 From $\mathrm{RL}^2$ to $D^2$: Closing the Loop

In the final step, we provide a translation from $\mathrm{RL}^2$ formulas
to $D^2$ formulas. The logic $D^2$ is a convenient target for the
translation of $\mathrm{RL}^2$ formulas. (Namely, a simple attempt at a
translation from $\mathrm{RL}^2$ to $I^2$ runs into a difficulty of the following
form. The formula $(\mathrm{card}^{\ge k} f)'$ is equivalent to
$\mathrm{card}^{\ge k} f(\langle 3 \rangle, \langle 1 \rangle)$,
which uses the index $\langle 3 \rangle$ that is not available in $I^2$. Similarly, an
attempt to translate from $\mathrm{RL}^2$ to $C^2$ runs into the difficulty of
variable capture.)

Figure 20 presents the translation from $\mathrm{RL}^2$ to $D^2$. The
correctness of the translation follows by induction on the


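\[
\begin{array}{rcl}
\multicolumn{3}{l}{e\,0 \in \mathrm{Nat}} \\
\multicolumn{3}{l}{e\,k \in \{y_1, y_2, \ldots\} \text{ for } k \in \{1, 2\}} \\[4pt]
T_{RD}\llbracket A \rrbracket e &=& A(e\,1) \\
T_{RD}\llbracket f \rrbracket e &=& f(e\,2, e\,1) \\
T_{RD}\llbracket \mathrm{EQ} \rrbracket e &=& (e\,2) = (e\,1) \\
T_{RD}\llbracket F_1 \land F_2 \rrbracket e &=& (T_{RD}\llbracket F_1 \rrbracket e) \land (T_{RD}\llbracket F_2 \rrbracket e) \\
T_{RD}\llbracket \lnot F \rrbracket e &=& \lnot(T_{RD}\llbracket F \rrbracket e) \\
T_{RD}\llbracket \mathrm{card}^{\ge k} F \rrbracket e &=& \exists^{\ge k}v.\ T_{RD}\llbracket F \rrbracket (e[0 \mapsto n,\ 1 \mapsto v,\ 2 \mapsto (e\,1)]) \\
&& v = y_n,\quad n = 1 + e\,0 \\
T_{RD}\llbracket {\sim}F \rrbracket e &=& T_{RD}\llbracket F \rrbracket (e[1 \mapsto (e\,2),\ 2 \mapsto (e\,1)]) \\
T_{RD}\llbracket F' \rrbracket e &=& T_{RD}\llbracket F \rrbracket (e[1 \mapsto (e\,2)])
\end{array}
\]

correctness criterion:

\[
\llbracket T_{RD}\llbracket F \rrbracket e \rrbracket e_C = \llbracket F \rrbracket (e_C \circ e)
\]

result is in $D^2$:

\[
\mathrm{FV}(T_{RD}\llbracket F \rrbracket e) \subseteq \{e\,1, e\,2\}
\]

Figure 20: Translating $\mathrm{RL}^2$ formulas to $D^2$ formulas.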

structure of formulas. Furthermore, each subformula $G_i$ of
a formula $T_{RD}\llbracket F \rrbracket e$ is of the form $G_i = T_{RD}\llbracket G \rrbracket e_i$ for some $G$
and $e_i$, and by induction it follows that the free variables of
$T_{RD}\llbracket G \rrbracket e_i$ are among $\{e_i\,1, e_i\,2\}$. Therefore, $|\mathrm{FV}(G_i)| \le 2$
and the result of the translation is a $D^2$ formula.
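The environment-passing translation of Figure 20 can be sketched as follows. The tuple encoding, the string output, and the name `t_rd` are assumptions of this illustration; the environment is a dictionary whose entry 0 is the counter for fresh variables and whose entries 1 and 2 are variable names.

```python
def t_rd(F, e):
    """Translate a role logic formula into a formula with named variables.

    Illustrative encoding: ("A", name), ("f", name), ("EQ",),
    ("and", F1, F2), ("not", F1), ("card", k, F1),
    ("inv", F1) for the inverse and ("prime", F1) for the prime operator.
    """
    tag = F[0]
    if tag == "A":
        return f"{F[1]}({e[1]})"
    if tag == "f":
        return f"{F[1]}({e[2]},{e[1]})"
    if tag == "EQ":
        return f"{e[2]}={e[1]}"
    if tag == "and":
        return f"({t_rd(F[1], e)} & {t_rd(F[2], e)})"
    if tag == "not":
        return f"!{t_rd(F[1], e)}"
    if tag == "card":
        n = 1 + e[0]                     # next unused variable number
        v = f"y{n}"
        inner = {0: n, 1: v, 2: e[1]}    # old index 1 becomes index 2
        return f"E>={F[1]} {v}. {t_rd(F[2], inner)}"
    if tag == "inv":
        return t_rd(F[1], {**e, 1: e[2], 2: e[1]})
    if tag == "prime":
        return t_rd(F[1], {**e, 1: e[2]})
    raise ValueError(f"unknown formula tag: {tag!r}")

start = {0: 2, 1: "y1", 2: "y2"}
primed = t_rd(("prime", ("card", 1, ("f", "f"))), start)
plain = t_rd(("card", 1, ("and", ("f", "f"), ("A", "A"))), start)
```

In `primed`, the atom under the quantifier mentions `y2`, the variable that would need a third index in the index-based logic; with fresh variable names this causes no difficulty, and each subformula still has at most two free variables.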

Summary. As indicated in Figure 10, we have presented
translations from $D^2$ to $C^2$, from $C^2$ to $I^2$, from $I^2$ to $\mathrm{RL}^2$,
and from $\mathrm{RL}^2$ to $D^2$. We conclude that $D^2$, $C^2$, $I^2$, and $\mathrm{RL}^2$
are all equivalent logics, and, by [30], decidable.

The satisfiability problem for $C^2$ formulas is shown to be
NEXPTIME-complete in [57]. We have observed that there
are efficient polynomial transformations of formulas from $D^2$
to $C^2$, from $C^2$ to $I^2$, from $I^2$ to $\mathrm{RL}^2$, and from $\mathrm{RL}^2$ to $D^2$
that yield formulas equivalent for satisfiability. (Moreover,
all transformations except the one from $C^2$ to $I^2$ yield equivalent
formulas in the same vocabulary.) As a result, the satisfiability
problem of all these logics is NEXPTIME-complete.

5 Applications of Role Logic 

We next present three applications of role logic. In
Section 5.1 we present a shape analysis technique based on
generating verification conditions in $\mathrm{RL}^2$ and applying the
decision procedure for $\mathrm{RL}^2$. In Section 5.2 we note that
boolean shape analysis constraints [48] are a subset of the
constraints expressible in role logic. In Section 5.3 we show
that a different subset of $\mathrm{RL}^2$ corresponds to an expressive
description logic [1, Chapter 5].

5.1 Static Analysis Based on $\mathrm{RL}^2$

This section shows how to use the decidability of $\mathrm{RL}^2$ for
static analysis of imperative programs. Figure 21 presents
the syntax of a simple imperative language. Figure 22
presents predicates in $\mathrm{RL}^2$ that describe the meaning of
statements in this language.

Program state. The state of the program is a first-order
structure interpreting the language $L = \mathcal{A} \cup \mathcal{F}$ where $\mathcal{A}$ is a
finite set of unary predicates and $\mathcal{F}$ is a finite set of binary
predicates. We fix a countable universe of objects obj, and
assume that each structure has the same universe obj. To
specify the structure, it suffices to give the set $e\,A \subseteq \mathrm{obj}$ for
each unary predicate $A \in \mathcal{A}$, and a binary relation $e\,f \subseteq
\mathrm{obj} \times \mathrm{obj}$ for each binary predicate $f \in \mathcal{F}$.

Extended language. For each $k \in \{\varepsilon, 0, 1, \ldots\}$ we define
the language $L_{(k)}$. We identify $L_{(\varepsilon)}$ with $L$, $A_{(\varepsilon)}$ with $A$,
and $f_{(\varepsilon)}$ with $f$. For $k \in \{0, 1, \ldots\}$, we let $A_{(k)}$ be a fresh
unary predicate symbol and $f_{(k)}$ a fresh binary predicate
symbol, and let $L_{(k)}$ be the set of all $A_{(k)}$ and $f_{(k)}$. The notation
formRen $(i \to j)$ $F$ for $i, j \in \{\varepsilon, 0, 1, 2, \ldots\}$ denotes the formula
resulting from $F$ by replacing all elements of $L_{(i)}$ with the
corresponding elements of $L_{(j)}$.

Describing relations in the extended language. The
meaning of each statement in our imperative language is a
binary relation on $L$-structures. We describe a binary
relation on structures with an $\mathrm{RL}^2$ formula in the language
$L_{(0)} \cup L_{(\varepsilon)}$. The predicates in $L_{(\varepsilon)}$ denote the state
components in the final state; the predicates in $L_{(0)}$ denote the
state components in the initial state. If $F$ is a formula
in the language $L_{(\varepsilon)}$, then $\underline{F}$ is a shorthand for the formula
formRen $(\varepsilon \to 0)$ $F$ in the language $L_{(0)}$; the purpose of $\underline{F}$ is
to denote the value of the formula $F$ evaluated in the initial
state.

Define the renaming operator strucRen $(i \to j)$ such that
if $e_{(i)}$ is an $L_{(i)}$-structure, then $e_{(j)} = \mathrm{strucRen}\ (i \to j)\ e_{(i)}$
is an $L_{(j)}$-structure such that $e_{(j)}\,A_{(j)} = e_{(i)}\,A_{(i)}$ and
$e_{(j)}\,f_{(j)} = e_{(i)}\,f_{(i)}$ for every unary predicate $A$ and binary
predicate $f$. Then the relation
on $L$-structures denoted by an $\mathrm{RL}^2$ formula $F$ in the language
$L_{(0)} \cup L_{(\varepsilon)}$ is $\{(e, e') \mid \llbracket F \rrbracket((\mathrm{strucRen}\ (\varepsilon \to 0)\ e) \cup e')\}$.
Assignment statements. The imperative language in
Figure 22 contains three forms of assignment statements.

The statement $A := F$ evaluates the formula $F$, which
denotes a unary predicate. The statement makes $A$ true
precisely for those objects for which $F$ was true in the
initial state. Unary predicates other than $A$, as well as binary
predicates, remain unchanged.

The statement $F_1.f := F_2$ generalizes the statement
x.f = y in a language like Java by allowing simultaneous
modification of fields of a set of objects. Formula $F_1$
specifies the set of objects whose fields are modified. Formula
$F_2$ specifies the new value of the field $f$ for objects in $F_1$.
Unary predicates and binary predicates other than $f$ remain
unchanged. Note that $F_2$ may specify a relation, which is
particularly interesting when $F_1$ denotes a set with more
than one element because it allows the value of the field to
depend on the source object of the field. As a special case,
$F_1.f := g$ copies the entire field $g$ into field $f$ for all objects
in the set given by $F_1$, and, in particular, $\mathrm{true}.f := g$ copies
the field $g$ into $f$. The statement $F_1.{\sim}f := F_2$ is dual to
$F_1.f := F_2$, and updates the inverse of the predicate $f$.
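The effect of the field assignment on a concrete finite structure can be sketched with Python sets. This is an extensional illustration only: the formulas are assumed to be already evaluated in the initial state, with the set `sources` standing for the objects selected by the first formula and the set of pairs `new_value` for the relation given by the second; the function name is hypothetical.

```python
def assign_field(f_old, sources, new_value):
    """Return the new value of field f after the simultaneous assignment.

    f_old     : set of (source, target) pairs, the field before the update
    sources   : set of objects whose field is overwritten
    new_value : set of (source, target) pairs giving the new field value
    """
    overwritten = {(s, t) for (s, t) in new_value if s in sources}
    kept = {(s, t) for (s, t) in f_old if s not in sources}
    return overwritten | kept

f_old = {("a", "b"), ("c", "d")}
g = {("a", "e"), ("c", "e")}

# Only the field of object "a" is modified
partial = assign_field(f_old, {"a"}, g)

# Selecting every object copies g into f
universe = {"a", "b", "c", "d", "e"}
copied = assign_field(f_old, universe, g)
```

Because `new_value` is a relation rather than a single object, the new field value may differ from one source object to another.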
Statements for specification. The statement assume $F$
filters out the state transitions for which $F$ does not hold
in the initial state. The statement assert $F$ behaves
arbitrarily if the condition given by $F$ does not hold in the
initial state. The state contains an additional predicate Error,
which makes it easier to detect that an arbitrary behavior



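Refinement claim $P_1 \Rightarrow P_2$:

\[
\llbracket S_1 \rrbracket \land \lnot(\llbracket S_2 \rrbracket (B_1 \mapsto A_1, \ldots, B_n \mapsto A_n))
\]

is not satisfiable, where:

    $P_1(A_1, \ldots, A_n) = S_1$
    $P_2(B_1, \ldots, B_n) = S_2$
    $\llbracket S_2 \rrbracket$ has no fresh predicates

\[
\begin{array}{rcl}
\llbracket A := F \rrbracket &=& [A \iff \underline{F}] \land \mathrm{modUnary}\ A \\[4pt]
\llbracket F_1.f := F_2 \rrbracket &=& [[\underline{F_1}' \Rightarrow [f \iff \underline{F_2}]]] \land
    [[\lnot\underline{F_1}' \Rightarrow [f \iff \underline{f}]]] \land \mathrm{modBinary}\ f \\[4pt]
\llbracket F_1.{\sim}f := F_2 \rrbracket &=& [[\underline{F_1}' \Rightarrow [{\sim}f \iff \underline{F_2}]]] \land
    [[\lnot\underline{F_1}' \Rightarrow [{\sim}f \iff {\sim}\underline{f}]]] \land \mathrm{modBinary}\ f \\[4pt]
\llbracket P(F_1, \ldots, F_n) \rrbracket &=& \llbracket S \rrbracket (A_1 \mapsto F_1, \ldots, A_n \mapsto F_n)
    \quad \text{where } P(A_1, \ldots, A_n) = S \\[4pt]
\llbracket \mathsf{assume}\ F \rrbracket &=& \underline{F} \land \mathrm{skip} \\
\llbracket \mathsf{assert}\ F \rrbracket &=& \underline{F} \Rightarrow \mathrm{skip} \\
\llbracket \mathsf{spec}\ F \rrbracket &=& \llbracket F \rrbracket \\
\llbracket S_1 \land S_2 \rrbracket &=& \llbracket S_1 \rrbracket \land \llbracket S_2 \rrbracket \\
\llbracket S_1 \lor S_2 \rrbracket &=& \llbracket S_1 \rrbracket \lor \llbracket S_2 \rrbracket \\
\llbracket S_1 ; S_2 \rrbracket &=& \mathrm{formRen}\ (\varepsilon \to k)\ \llbracket S_1 \rrbracket \land
    ([\lnot \mathrm{Error}] \Rightarrow \mathrm{formRen}\ (0 \to k)\ \llbracket S_2 \rrbracket) \\
&& k \text{ a fresh element of } \{1, 2, \ldots\} \\[4pt]
\llbracket \mathsf{modify}\ E \rrbracket &=& \mathcal{M}\llbracket \mathsf{modify}\ E \rrbracket \\[4pt]
\mathrm{modUnary}\ A &=& \bigwedge_{B \ne A} [B \iff \underline{B}] \land
    \bigwedge_{g} [[g \iff \underline{g}]] \land [\mathrm{Error} \iff \underline{\mathrm{Error}}] \\[4pt]
\mathrm{modBinary}\ f &=& \bigwedge_{B} [B \iff \underline{B}] \land
    \bigwedge_{g \ne f} [[g \iff \underline{g}]] \land [\mathrm{Error} \iff \underline{\mathrm{Error}}] \\[4pt]
\mathrm{skip} &=& \bigwedge_{B} [B \iff \underline{B}] \land
    \bigwedge_{g} [[g \iff \underline{g}]] \land [\mathrm{Error} \iff \underline{\mathrm{Error}}]
\end{array}
\]

Figure 22: Predicates Describing the Semantics of the Language
from Figure 21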
F : a role logic formula
A : unary predicate
f : binary predicate

procedure  ::= procName(unaryList) = Stat
refinement ::= procName => procName
unaryList  ::= A | unaryList, A

Stat ::= asgnStat                  assignment statement
      |  procName(paramList)       procedure call
      |  assume F                  assume statement
      |  assert F                  assert statement
      |  spec Fe                   specification
      |  Stat ∨ Stat               non-deterministic choice
      |  Stat ∧ Stat               conjunction
      |  Stat; Stat                sequential composition

asgnStat ::= A := F                update of unary predicate
          |  F1.f := F2            update of binary predicate
          |  F1.~f := F2           update of inverse of binary predicate

Fe ::= A | f | EQ | F1 ∧ F2 | ¬F
    |  F' | ~F | card^{≥k} F
    |  asgnStat | modify items | procName(paramList)

paramList ::= F | paramList, F

items   ::= modItem | items, modItem
modItem ::= A :<= F                modification of unary predicate
         |  F1.f :<= F2            modification of binary predicate
         |  F1.~f :<= F2           modification of inverse of binary predicate

Figure 21: Syntax of a Small Imperative Language
proc assignClients() =
  spec old(GlobalInvariant) =>
    (modify WaitingClients, AssignedClients,
            old(WaitingClients).server :<= Servers,
            Servers.clients :<= old(WaitingClients)) &
    ! {WaitingClients} &
    [AssignedClients <=>
       old(AssignedClients | WaitingClients)] &
    GlobalInvariant

proc assignOneClient(cl) =
  spec old(GlobalInvariant) &
       [cl => old(WaitingClients)] =>
    (modify WaitingClients, AssignedClients,
            cl.server :<= Servers,
            Servers.clients :<= cl) &
    [WaitingClients | cl <=> old(WaitingClients)] &
    [AssignedClients <=> old(AssignedClients) | cl] &
    GlobalInvariant

Figure 24: Specifications for assignClients and
assignOneClient extended with side effect specifications.

occurred (the sequential composition operator ensures that
the Error value is propagated).

The statement spec $F_e$ allows describing relations on
states directly in terms of an extended $\mathrm{RL}^2$ formula $F_e$.
Formula $F_e$ allows assignment statements and modify
statements in addition to the constructs of $\mathrm{RL}^2$. The relation
symbols of $\mathrm{RL}^2$ may refer to relation symbols of the
extended language, which allows stating relations between pre-
and postconditions. We also allow non-recursive procedure
calls in the specification when they expand to constructs not
containing sequential composition.

modify specifications. The construct

    modify $e_1, \ldots, e_n$

is useful for specifying frame conditions. Each expression $e_i$
specifies a set of possible modifications. Any finite number of
modifications can occur as the result of the action specified
by the modify specification.

Example 8 Figure 24 shows the specifications
assignClients and assignOneClient from Figure 3
extended with frame-condition specifications. The frame
condition for assignOneClient specifies that only the
sets WaitingClients and AssignedClients can change, which
is useful if the system contains some additional set of
objects, such as a set ProcessedClients. Next, the frame
condition specifies that the only binary relations that were
modified are server and clients. The modifies expression
(Servers.clients :<= cl) indicates that the only way in
which the clients relation is changed is by introducing an
edge from a Servers object to the cl object, or by removing
an edge from a Servers object. (The removal of the edge does
not, in fact, occur in assignOneClientIMPL in Figure 3,
but the frame condition is a conservative approximation.)
The amount of detail in specifications such as modifies
clauses depends on how strong a property we need to prove.
The strength of the property, in turn, depends either on
some high-level program correctness requirement, or on
the amount of information we need about the procedure
to prove the properties of its callers. In Figure 3, we
did not use a modify specification for assignOneClient
because we did not need it to prove the conformance
of assignClientsIMPL with respect to assignClients.
However, even in Figure 3 we needed to know that, for
example, getServer preserves the global invariant, which
follows from the fact that it does not modify any sets or
relations (the conjunction with skip implies that getServer
is a pure function).

♦ 

In general, there are three forms of modification expressions.
The expression $A$ :<= $F$ specifies modifications that
remove an element from the set $A$ or insert into $A$ an element
that satisfies $F$. For example, after executing the statement

    modify $A$ :<= $F$

the set $A$ may contain any subset of the set of objects given
by the expression $A \lor F$. The expression $F_1.f$ :<= $F_2$
specifies modifications that 1) remove a tuple $(o_1, o_2)$ from the
relation interpreting the predicate $f$, when $o_1$ satisfies $\underline{F_1}$,
or 2) insert a tuple $(o_1, o_2)$ into the relation interpreting
$f$, when $o_1$ satisfies $F_1$ and $(o_1, o_2)$ satisfies $F_2$. Similarly,
$F_1.{\sim}f$ :<= $F_2$ allows removing $(o_1, o_2)$ from the
interpretation of $f$ when $o_2$ satisfies $F_1$, or inserting $(o_1, o_2)$ when $o_2$
satisfies $F_1$ and $(o_1, o_2)$ satisfies ${\sim}F_2$.
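The frame conditions described above can be checked on concrete finite structures. The sketch below is an extensional reading under stated assumptions: formulas are given as already evaluated sets (or sets of pairs) in the initial state, and the function names are hypothetical.

```python
def modify_unary_allows(a_old, f_old, a_new):
    """A :<= F permits removals from A and insertions of objects satisfying F."""
    return a_new <= (a_old | f_old)

def modify_binary_allows(rel_old, sources, insertable, rel_new):
    """F1.f :<= F2: every changed tuple must have its source in F1,
    and every inserted tuple must additionally satisfy F2."""
    removed = rel_old - rel_new
    inserted = rel_new - rel_old
    return (all(s in sources for (s, _) in removed)
            and all(s in sources and (s, t) in insertable
                    for (s, t) in inserted))

# Servers.clients :<= cl, with a single server s1 and the client c2 as cl
servers = {"s1"}
clients_old = {("s1", "c1")}
may_insert = {("s1", "c2")}        # pairs whose target is the cl object

ok_insert = modify_binary_allows(
    clients_old, servers, may_insert, {("s1", "c1"), ("s1", "c2")})
ok_remove = modify_binary_allows(clients_old, servers, may_insert, set())
bad_insert = modify_binary_allows(
    clients_old, servers, may_insert, {("s1", "c1"), ("s1", "c3")})
```

As in Example 8, removing an edge from a Servers object is permitted by the frame condition even if the implementation never does so.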

If $r_i$ is the relation describing a modification given by the
expression $e_i$, then the meaning of modify $e_1, \ldots, e_n$ is given
by the relation

\[
(r_1 \cup \ldots \cup r_n)^* \tag{4}
\]

where $r^*$ denotes the transitive closure of relation $r$. The
simple semantics (4) provides good intuition about the
meaning of the modify statement and makes it clear that the
modify statement is idempotent [44]. Figure 23 presents an
alternative semantics, which directly encodes a modify
statement as an $\mathrm{RL}^2$ formula. The advantage of the semantics in
Figure 23 is that it eliminates the need for transitive closure
of the transition relation.

Disjunction and conjunction. The language allows
computing disjunction and conjunction on statements.
Disjunction $\lor$ has a natural interpretation as a non-deterministic
choice of commands. Conjunction $\land$ is useful for combining
nondeterministic statements. Logical operations on
statements translate directly to the corresponding logical
operations on $\mathrm{RL}^2$ formulas.

Computing sequential composition. When encoding
sequential composition of statements in $\mathrm{RL}^2$, we introduce
copies $L_{(i)}$ of predicate names in $L$ for $i \in \{1, 2, \ldots\}$. These
copies of predicate names denote the values of predicates
at program points between the initial and the final
program state. Because the definition of relation composition
$r_1 \circ r_2 = \{(x, z) \mid \exists y.\ (x, y) \in r_1 \land (y, z) \in r_2\}$ involves
existential quantification over $y$, we treat the newly introduced
predicates as being existentially quantified. The technique
of introducing new predicate names allows us to precisely
compute relation composition even for non-deterministic
commands.
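The definition of relation composition used here can be written directly as a set comprehension. In this small sketch, states are plain strings and the transition relations are hypothetical; the intermediate state plays the role of the existentially quantified copy of the predicates.

```python
def compose(r1, r2):
    """Relation composition: (x, z) is included if some intermediate
    state y has (x, y) in r1 and (y, z) in r2."""
    return {(x, z) for (x, y) in r1 for (y2, z) in r2 if y == y2}

# A non-deterministic first command followed by a deterministic second one
r1 = {("s0", "s1"), ("s0", "s2")}
r2 = {("s1", "s3"), ("s2", "s4")}
both = compose(r1, r2)
```

Both outcomes of the first command are carried through, so `both` relates `s0` to `s3` and to `s4`.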

Procedure calls. The meaning of a procedure is also a 
relation on states, where the initial state is extended with 
one unary predicate symbol for each parameter name. In 
the simple translation of Figure 22, a procedure call identi- 
fies parameters with the sets that describe their values by 
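$\mathcal{M}\llbracket \mathsf{modify}\ e_1, \ldots, e_n \rrbracket =$

let $\{e_1, \ldots, e_n\} = \{$
    $A_1$ :<= $F_1$, ..., $A_k$ :<= $F_k$,
    $F_{k+1}.f_{k+1}$ :<= $G_{k+1}$, ..., $F_l.f_l$ :<= $G_l$,
    $F_{l+1}.{\sim}f_{l+1}$ :<= $G_{l+1}$, ..., $F_m.{\sim}f_m$ :<= $G_m$ $\}$
in

\[
\begin{array}{l}
\displaystyle \bigwedge_{A \notin \{A_1, \ldots, A_k\}} [A \iff \underline{A}] \ \land \\[10pt]
\displaystyle \bigwedge_{A \in \{A_1, \ldots, A_k\}} \Big[\big(\lnot\underline{A} \land \bigwedge_{A_i = A} \lnot\underline{F_i}\big) \Rightarrow \lnot A\Big] \ \land \\[10pt]
\displaystyle \bigwedge_{f \notin \{f_{k+1}, \ldots, f_m\}} [[f \iff \underline{f}]] \ \land \\[10pt]
\displaystyle \bigwedge_{f \in \{f_{k+1}, \ldots, f_m\}} \Big[\Big[\big(\bigwedge_{f_i = f,\ k < i \le l} \lnot\underline{F_i}' \land \bigwedge_{f_i = f,\ l < i \le m} \lnot\underline{F_i}\big) \Rightarrow (f \iff \underline{f})\Big]\Big] \ \land \\[10pt]
\displaystyle \bigwedge_{f \in \{f_{k+1}, \ldots, f_m\}} \Big[\Big[\big(\lnot\underline{f} \land \bigwedge_{f_i = f,\ k < i \le l} \lnot(\underline{F_i}' \land \underline{G_i}) \land \bigwedge_{f_i = f,\ l < i \le m} \lnot(\underline{F_i} \land {\sim}\underline{G_i})\big) \Rightarrow \lnot f\Big]\Big]
\end{array}
\]

Figure 23: Semantics of modify statement.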



performing the substitution. Substitution suffices to give
semantics to procedures because we assume that the recursion
is split using refinement claims. Loops are represented as
recursive procedures, so we effectively require loop invariants.

Refinement claims. If $P_1$ and $P_2$ are procedure names,
the refinement claim $P_1 \Rightarrow P_2$ is a proof obligation that
the relation given by the body of procedure $P_1$ is contained
in the relation given by the body of $P_2$. The intended use
of the refinement claim is the specification of procedure
summaries, which allows breaking the cycles in the call graphs
of mutually recursive procedures. Figure 22 shows how each
refinement claim reduces to a test of whether an $\mathrm{RL}^2$ formula
is satisfiable. When generating the $\mathrm{RL}^2$ formula, we rename
the parameters of $P_2$, replacing them with the corresponding
parameters of $P_1$.

To ensure that the satisfiability test treats newly introduced
predicates as existentially quantified, we impose the
restriction that the translation $\llbracket S_2 \rrbracket$ contains no newly
introduced predicates from $L_{(i)}$ for $i \in \{1, 2, \ldots\}$. We impose this
restriction because $\llbracket S_2 \rrbracket$ appears under negation in the
satisfiability test, so newly introduced predicates in $\llbracket S_2 \rrbracket$ would
be universally quantified, thus violating the semantics of
sequential composition for non-deterministic statements. The
restriction on $S_2$ is satisfied when $S_2$ contains no sequential
composition, which is typically the case for a large class of
procedure summaries.

By providing sufficiently many procedure summaries, the
partial correctness of a program is reduced to a finite number
of refinement claims. By discharging these claims using a
decision procedure for $\mathrm{RL}^2$, we decide the partial correctness
of the program.

Fixpoint computation. If some procedure summaries 
are not supplied by the programmer, they can be inferred 
using fixpoint computation. An algorithm for fixpoint com- 
putation can be derived from the fixpoint semantics of 
mutually recursive procedures using abstract interpretation 
[19, 21, 20, 70]. A special case of this approach is to select a 
Figure 25: Boolean Shape Analysis Constraints expressed
as a sublogic of $\mathrm{RL}^2$

finite subset of all $\mathrm{RL}^2$ formulas and define a lattice structure
on the set using the entailment of formulas. A simple way
to define a finite subset of formulas is to consider only $\mathrm{RL}^2$
formulas with quantifier depth at most $k$, for some $k \ge 1$.
Boolean shape analysis constraints in Section 5.2 have
quantifier depth at most two, so they can be used as a basis of
fixpoint computation.

5.2 Describing Boolean Shape Analysis Constraints

Boolean Shape Analysis Constraints [48] are a natural
language for describing dataflow facts of shape analyses [65].

Figure 25 presents the syntax of Boolean Shape Analysis
Constraints as a subset of role logic. This presentation
of Boolean Shape Analysis Constraints shows that they are
a subset of the decidable fragment $\mathrm{RL}^2$ of role logic. In
fact, Boolean Shape Analysis Constraints do not use counting
quantifiers, so they are already expressible in the
two-variable predicate logic (without counting).

A note on usability of role logic. Anecdotal evidence
of the usability of role logic is the fact that all results
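\[
\begin{array}{rcl}
C &::=& A \mid C \sqcap C \mid \lnot C \mid\ \ge n\, R.C \\
R &::=& f \mid R \sqcap R \mid \lnot R \mid U \mid R^{-1} \mid R|_C \mid \mathrm{id}(C)
\end{array}
\]

$A$ : atomic unary predicate

$f$ : atomic binary predicate

Figure 26: An Expressive Description Logic

\[
\begin{array}{rcl}
\llbracket A \rrbracket &=& A \\
\llbracket \ge n\, R.C \rrbracket &=& \mathrm{card}^{\ge n}(\llbracket R \rrbracket \land \llbracket C \rrbracket) \\
\llbracket f \rrbracket &=& f \\
\llbracket R_1 \sqcap R_2 \rrbracket &=& \llbracket R_1 \rrbracket \land \llbracket R_2 \rrbracket \\
\llbracket \lnot R \rrbracket &=& \lnot \llbracket R \rrbracket \\
\llbracket U \rrbracket &=& \mathrm{true} \\
\llbracket R^{-1} \rrbracket &=& {\sim}\llbracket R \rrbracket \\
\llbracket \mathrm{id}(C) \rrbracket &=& \mathrm{EQ} \land \llbracket C \rrbracket
\end{array}
\]

Figure 27: Translation of an Expressive Description Logic
to Role Logic with Two Variables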

of [48] were initially shown using role logic notation and then
translated into the standard first-order logic notation. We
have found the variable-free aspect of role logic convenient
when showing the results of [48]. We have subsequently
discovered the connection of role logic with $C^2$ [30], presented
in Section 4, and the connection with description logics [1],
presented in Section 5.3.

5.3 Encoding an Expressive Description Logic 

Figure 26 presents an Expressive Description Logic fragment
where roles have no transitive operators [1, Chapter 5].
Figure 27 presents the translation of the Expressive Description
Logic into $\mathrm{RL}^2$. The translation maps the concepts $C$ and
roles $R$ of description logic into unary and binary predicates
of role logic. The translation to $\mathrm{RL}^2$ in Figure 27 implies that
the description logic in Figure 26 is decidable. The fact that
interesting description logics can be translated to $\mathrm{RL}^2$ is not
surprising once we have established that $\mathrm{RL}^2$ and $C^2$ have
equal expressive power. Nevertheless, it is interesting to
observe the simplicity of the translation from the description
logic to $\mathrm{RL}^2$, which is partly because both description logic
and role logic avoid explicit occurrences of variables.
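The simplicity of this translation is visible when it is written as a recursive function. The following is a minimal sketch, assuming a tuple encoding of concepts and roles and a string output; the constructor tags and the name `t_dl` are illustrative, and intersection and negation of concepts are assumed to be handled like those of roles.

```python
def t_dl(X):
    """Translate a description logic concept or role into role logic text.

    Illustrative encoding: ("atom", A), ("role", f), ("and", X1, X2),
    ("not", X1), ("atleast", n, R, C), ("univ",), ("inv", R), ("id", C).
    """
    tag = X[0]
    if tag in ("atom", "role"):
        return X[1]
    if tag == "and":
        return "(" + t_dl(X[1]) + " & " + t_dl(X[2]) + ")"
    if tag == "not":
        return "!" + t_dl(X[1])
    if tag == "atleast":
        # at least n R-successors in C: count objects satisfying R and C
        return ("card>=" + str(X[1]) + "(" + t_dl(X[2])
                + " & " + t_dl(X[3]) + ")")
    if tag == "univ":
        return "true"
    if tag == "inv":
        return "~" + t_dl(X[1])
    if tag == "id":
        return "(EQ & " + t_dl(X[1]) + ")"
    raise ValueError(f"unknown constructor: {tag!r}")

# At least two inverse-child successors that are Persons
concept = ("atleast", 2, ("inv", ("role", "child")), ("atom", "Person"))
text = t_dl(concept)
```

No variable names appear on either side of the translation, which is why each case is a one-line rewrite.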
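Using the rules

\[
\begin{array}{rcl}
\llbracket R_1 \circ R_2 \rrbracket &=& \llbracket R_1 \rrbracket \circ \llbracket R_2 \rrbracket \\
\llbracket R^* \rrbracket &=& \llbracket R \rrbracket^*
\end{array}
\]

we can translate operations on binary relations into the full
role logic, but not into the decidable fragment $\mathrm{RL}^2$.
Decidability of interesting description logics that contain
transitive closure but do not have the tree model property is an open
problem [1, Page 214].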

A note on terminology. The term "role" has different
meanings in different formalisms for describing structures.
In [43], a role corresponds to a unary predicate (set), in
description logics [1], a role corresponds to a binary predicate
(relation), and in entity-relationship diagrams in databases
[16], a role corresponds to a position $i$ ($1 \le i \le n$) in an
$n$-tuple of an $n$-ary relation. To avoid confusion, we
use the well-established terms of $n$-ary "predicate" (or
"relation"), but keep the name "role logic" for the logic described
in Figure 9, because the term "role logic" appears
appropriate regardless of the particular interpretation of the word
"role".

Description Logics Corresponding to $C^2$.¹ The
result [10, Theorem 4] reports that the description logic
without transitive closure and relation composition (denoted
$\mathcal{DL} - \{\mathrm{trans}, \mathrm{compose}\}$) corresponds precisely to $C^2$. The
results of Section 4 and [10] imply that our logic $\mathrm{RL}^2$ has
the same expressive power as $\mathcal{DL} - \{\mathrm{trans}, \mathrm{compose}\}$. One
of the differences between $\mathrm{RL}^2$ and $\mathcal{DL} - \{\mathrm{trans}, \mathrm{compose}\}$
is that $\mathrm{RL}^2$ contains the prime operator $F'$ and does not
contain the product operation of $\mathcal{DL} - \{\mathrm{trans}, \mathrm{compose}\}$.
Another difference is the foundation of role logic on de
Bruijn lambda calculus notation, as described in Section 3.

6 Related Work 

We initially developed role logic to provide a foundation
for role analysis [43, 42]. We subsequently studied
a simplification of role analysis constraints and showed
a characterization of such constraints using formulas [46].
Parametric analysis based on three-valued logic was
introduced in [64, 65], with interprocedural analysis in [61] and
application to abstract data type verification in [52]. A
characterization of dataflow facts used for shape analysis was
presented in [71, 48]. A decidable logic for expressing
connectivity properties of the heap was presented in [7].

Specifying the semantics of programs using predicates
dates back to axiomatic program semantics [32, 24]. An
approach that uses a first-order logic theorem prover tailored
for program verification is [23].

Like [40, 39, 37, 55], in Section 5.1 we use an expressive
yet decidable logic to encode fragments of straight-line
code. Our approach differs primarily in using the logic $\mathrm{RL}^2$ over
general graphs, whose decidability follows from the
decidability of $C^2$, whereas [40, 39, 37, 55] use graph types, whose
decidability follows from the decidability of monadic
second-order logic over trees. We expect that these two logics can
be combined in a fruitful way.

We have extended our language with constructs that
make it possible to directly express higher-level state
transformations, an idea related to the chemical
reaction model of [26, 27], the verification of database
transactions [6], the simultaneous assignments of [55], and
wide-spectrum languages [56, 3]. Verification of a form of
modifies clauses using a theorem prover was presented in [50, 44].
Further approaches to pointer and shape analysis include
[17, 68, 15, 29, 25, 28, 69].

¹ Note added on 31 October 2003, after becoming aware of [10].
Description logics [1, 9] share many of the properties of
role logic and have been traditionally applied to knowledge
bases. It is likely that description logics can be used for
shape analysis as well. It would be particularly interesting to
consider description logics with transitive operators, whose
decidability is related to the decidability of dynamic logic
[31]. Reasoning about the satisfiability of expressive
description logics over all structures and over finite structures
is presented in [13, 14]. Reasoning about entity-relationship
diagrams [16] is presented in [51]. Some connections between
object models and heap invariants are presented in [45, 35].

Like the Alloy modelling language [36], role logic
combines the notation of predicate calculus with the notation of
relational algebras. It may be possible to combine the
notation of Alloy with the notation of role logic, and to combine
the benefits of bounded model checking used in the Alloy
Analyzer with the benefits of a decision procedure for $\mathrm{RL}^2$.

A recent approach to reasoning about mutable imperative
data structures is separation logic [34, 59, 60, 12, 11]. We
are currently working on integrating some aspects of spatial
logic to support a more flexible notation for records in role
logic.

Interactive theorem provers have also been used for
reasoning about dynamically allocated data structures [54, 2];
it may be interesting to incorporate a decision procedure for
$\mathrm{RL}^2$ into these general tools.

7 Conclusions 

We believe that role logic notation is a convenient way of
expressing properties of first-order structures. First-order
structures are a natural way to model the state of
object-oriented programs, or the state of a knowledge base or
a database. Role logic can be combined with traditional
variable-based notation in a natural way. Furthermore,
interesting subsets of role logic are decidable. Decision
procedures for role logic can therefore enable shape analysis of
programs and have similar benefits as description logics in
knowledge bases.